Changelog in Linux kernel 6.6.57

ALSA: hda/realtek: cs35l41: Fix device ID / model name [+ + +]

Author: Jean-Loïc Charroud <[email protected]>
Date:   Wed Feb 14 00:42:12 2024 +0100

    ALSA: hda/realtek: cs35l41: Fix device ID / model name
    
    [ Upstream commit b91050448897663b60b6d15525c8c3ecae28a368 ]
    
    The patch 51d976079976c800ef19ed1b542602fcf63f0edb ("ALSA: hda/realtek:
    Add quirks for ASUS Zenbook 2022 Models") modified the entry 1043:1e2e
    from "ASUS UM3402" to "ASUS UM6702RA/RC" and added another entry for
    "ASUS UM3402" with 1043:1ee2.
    The original name of the first entry was correct, while the new entry
    corresponds to model "ASUS UM6702RA/RC".
    Fix the model names for both devices.
    
    Fixes: 51d976079976 ("ALSA: hda/realtek: Add quirks for ASUS Zenbook 2022 Models")
    Signed-off-by: Jean-Loïc Charroud <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ALSA: hda/realtek: cs35l41: Fix order and duplicates in quirks table [+ + +]

Author: Jean-Loïc Charroud <[email protected]>
Date:   Wed Feb 14 00:44:24 2024 +0100

    ALSA: hda/realtek: cs35l41: Fix order and duplicates in quirks table
    
    [ Upstream commit 852d432a14dbcd34e15a3a3910c5c6869a6d1929 ]
    
    Move entry {0x1043, 0x16a3, "ASUS UX3402VA"} following device ID order.
    Remove duplicate entry for device {0x1043, 0x1f62, "ASUS UX7602ZM"}.
    
    Fixes: 51d976079976 ("ALSA: hda/realtek: Add quirks for ASUS Zenbook 2022 Models")
    Signed-off-by: Jean-Loïc Charroud <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: cs35l56: Load tunings for the correct speaker models [+ + +]

Author: Richard Fitzgerald <[email protected]>
Date:   Mon Jan 29 16:27:32 2024 +0000

    ASoC: cs35l56: Load tunings for the correct speaker models
    
    [ Upstream commit 245eeff18d7a37693815250ae15979ce98c3d190 ]
    
    If the "spk-id-gpios" property is present it points to GPIOs whose
    value must be used to select the correct bin file to match the
    speakers.
    
    Some manufacturers use multiple sources of speakers, which need
    different tunings for best performance. On these models the type of
    speaker fitted is indicated by the values of one or more GPIOs. The
    number formed by the GPIOs identifies the tuning required.
    
    The speaker ID must be used in combination with the subsystem ID
    (either from PCI SSID or cirrus,firmware-uid property), because the
    GPIOs can only indicate variants of a specific model.
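
As a rough illustration of "the number formed by the GPIOs", the sketch below packs GPIO levels into one integer. The function name and the bit order (first GPIO as least significant bit) are assumptions made for this example and are not taken from the driver.

```c
#include <stddef.h>

/* Hypothetical helper: combine GPIO levels (zero or non-zero) into one ID.
 * Assumption: gpio_values[0] is the least significant bit. */
int speaker_id_from_gpios(const int *gpio_values, size_t count)
{
	int id = 0;

	for (size_t i = 0; i < count; i++)
		id |= (gpio_values[i] ? 1 : 0) << i;

	return id;
}
```

With two GPIOs this yields four possible speaker IDs, each of which would still have to be paired with the subsystem ID to identify a tuning.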
    
    Signed-off-by: Richard Fitzgerald <[email protected]>
    Fixes: 1a1c3d794ef6 ("ASoC: cs35l56: Use PCI SSID as the firmware UID")
    Link: https://msgid.link/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ASoC: tas2781: mark dvc_tlv with __maybe_unused [+ + +]

Author: Gergo Koteles <[email protected]>
Date:   Thu Mar 28 23:47:37 2024 +0100

    ASoC: tas2781: mark dvc_tlv with __maybe_unused
    
    [ Upstream commit 831ec5e3538e989c7995137b5c5c661991a09504 ]
    
    Since we put the dvc_tlv static variable in a header file, it is copied
    to each module that includes the header. But not all of them actually
    use it.
    
    Fix this W=1 build warning:
    
    include/sound/tas2781-tlv.h:18:35: warning: 'dvc_tlv' defined but not
    used [-Wunused-const-variable=]
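
A minimal userspace sketch of the same idea, assuming GCC or Clang attribute syntax. MAYBE_UNUSED stands in for the kernel's __maybe_unused, and example_tlv is a placeholder name, not the real dvc_tlv definition.

```c
/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#define MAYBE_UNUSED __attribute__((__unused__))

/* Imagine this definition living in a shared header: every file that
 * includes it gets its own copy. The attribute tells the compiler not to
 * warn in files that never read the variable. */
static const MAYBE_UNUSED int example_tlv[] = { 0, 1, 2 };

/* A file that includes the header but never touches example_tlv. */
int unrelated_function(void)
{
	return 42;
}
```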
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: ae065d0ce9e3 ("ALSA: hda/tas2781: remove digital gain kcontrol")
    Signed-off-by: Gergo Koteles <[email protected]>
    Message-ID: <0e461545a2a6e9b6152985143e50526322e5f76b.1711665731.git.soyer@irl.hu>
    Signed-off-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ata: ahci: Add mask_port_map module parameter [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Thu Apr 4 18:30:14 2024 +0900

    ata: ahci: Add mask_port_map module parameter
    
    [ Upstream commit 24cfd86433c920188ac3f02df8aba6bc4c792f4b ]
    
    Commits 0077a504e1a4 ("ahci: asm1166: correct count of reported ports")
    and 9815e3961754 ("ahci: asm1064: correct count of reported ports")
    attempted to limit the ports of the ASM1166 and ASM1064 AHCI controllers
    to avoid long boot times caused by the fact that these adapters report
    a port map larger than the number of physical ports. The excess ports
    are "virtual" to hide port multiplier devices and probing these ports
    takes time. However, these commits caused a regression for users that do
    use PMP devices, as the ATA devices connected to the PMP cannot be
    scanned. These commits have thus been reverted by commit 6cd8adc3e18
    ("ahci: asm1064: asm1166: don't limit reported ports") to allow the
    discovery of devices connected through a port multiplier. But this
    revert re-introduced the long boot times for users that do not use a
    port multiplier setup.
    
    This patch adds the mask_port_map ahci module parameter to allow users
    to manually specify port map masks for controllers. In the case of the
    ASMedia 1166 and 1064 controllers, users that do not have port
    multiplier devices can mask the excess virtual ports exposed by the
    controller to speed up port scanning, thus reducing boot time.
    
    The mask_port_map parameter accepts 2 different formats:
     - mask_port_map=<mask>
       This applies the same mask to all AHCI controllers
       present in the system. This format is convenient for small systems
       that have only a single AHCI controller.
     - mask_port_map=<pci_dev>=<mask>,<pci_dev>=<mask>,...
       This applies the specified masks only to the PCI device listed. The
       <pci_dev> field is a regular PCI device ID (domain:bus:dev.func).
       This ID can be seen following "ahci" in the kernel messages. E.g.
       for "ahci 0000:01:00.0: 2/2 ports implemented (port mask 0x3)", the
       <pci_dev> field is "0000:01:00.0".
    
    When used, the function ahci_save_initial_config() indicates that a
    port map mask was applied with the message "masking port_map ...".
    E.g.: without a mask:
    modprobe ahci
    dmesg | grep ahci
    ...
    ahci 0000:00:17.0: AHCI vers 0001.0301, 32 command slots, 6 Gbps, SATA mode
    ahci 0000:00:17.0: (0000:00:17.0) 8/8 ports implemented (port mask 0xff)
    
    With a mask:
    modprobe ahci mask_port_map=0000:00:17.0=0x1
    dmesg | grep ahci
    ...
    ahci 0000:00:17.0: masking port_map 0xff -> 0x1
    ahci 0000:00:17.0: AHCI vers 0001.0301, 32 command slots, 6 Gbps, SATA mode
    ahci 0000:00:17.0: (0000:00:17.0) 1/8 ports implemented (port mask 0x1)
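
The effect of the mask in the log output above can be sketched with plain bit operations. Both helper names are hypothetical. The sketch assumes the mask is simply ANDed with the reported port map, which matches the "0xff -> 0x1" message but is not copied from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: apply a user-supplied mask to the port map that the
 * controller reports. Only ports set in both values remain visible. */
uint32_t apply_port_mask(uint32_t port_map, uint32_t mask)
{
	return port_map & mask;
}

/* Count the ports present in a port map (one bit per port). */
unsigned int count_ports(uint32_t port_map)
{
	unsigned int n = 0;

	while (port_map) {
		n += port_map & 1u;
		port_map >>= 1;
	}
	return n;
}
```

For the controller in the example, a reported map of 0xff with a mask of 0x1 leaves one of eight ports to be probed.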
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Niklas Cassel <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ata: libata: avoid superfluous disk spin down + spin up during hibernation [+ + +]

Author: Niklas Cassel <[email protected]>
Date:   Tue Oct 8 15:58:44 2024 +0200

    ata: libata: avoid superfluous disk spin down + spin up during hibernation
    
    commit a38719e3157118428e34fbd45b0d0707a5877784 upstream.
    
    A user reported that commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi
    device manage_system_start_stop") introduced a spin down + immediate spin
    up of the disk both when entering and when resuming from hibernation.
    This behavior was not there before, and causes an increased latency both
    when entering and when resuming from hibernation.
    
    Hibernation is done by three consecutive PM events, in the following order:
    1) PM_EVENT_FREEZE
    2) PM_EVENT_THAW
    3) PM_EVENT_HIBERNATE
    
    Commit aa3998dbeb3a ("ata: libata-scsi: Disable scsi device
    manage_system_start_stop") modified ata_eh_handle_port_suspend() to call
    ata_dev_power_set_standby() (which spins down the disk), for both event
    PM_EVENT_FREEZE and event PM_EVENT_HIBERNATE.
    
    Documentation/driver-api/pm/devices.rst, section "Entering Hibernation",
    explicitly mentions that PM_EVENT_FREEZE does not have to put the device
    in a low-power state, and actually recommends not doing so. Thus, let's not
    spin down the disk on PM_EVENT_FREEZE. (The disk will instead be spun down
    during the subsequent PM_EVENT_HIBERNATE event.)
    
    This way, PM_EVENT_FREEZE will behave as it did before commit aa3998dbeb3a
    ("ata: libata-scsi: Disable scsi device manage_system_start_stop"), while
    PM_EVENT_HIBERNATE will continue to spin down the disk.
    
    This will avoid the superfluous spin down + spin up when entering and
    resuming from hibernation, while still making sure that the disk is spun
    down before actually entering hibernation.
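
The resulting policy for the three hibernation events can be sketched as a small decision function. The enum and function names are illustrative only; the kernel's PM_EVENT_* constants have different values and ata_eh_handle_port_suspend() handles more cases than shown here.

```c
#include <stdbool.h>

/* Illustrative event codes for the hibernation sequence described above. */
enum example_pm_event {
	EXAMPLE_PM_EVENT_FREEZE,
	EXAMPLE_PM_EVENT_THAW,
	EXAMPLE_PM_EVENT_HIBERNATE,
};

/* Returns true when the disk should be spun down for this event. */
bool should_spin_down(enum example_pm_event event)
{
	switch (event) {
	case EXAMPLE_PM_EVENT_HIBERNATE:
		return true;	/* last event before hibernation is entered */
	case EXAMPLE_PM_EVENT_FREEZE:	/* no low-power state needed here */
	case EXAMPLE_PM_EVENT_THAW:
	default:
		return false;
	}
}
```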
    
    Cc: [email protected] # v6.6+
    Fixes: aa3998dbeb3a ("ata: libata-scsi: Disable scsi device manage_system_start_stop")
    Reviewed-by: Damien Le Moal <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Niklas Cassel <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: Fix usage of __hci_cmd_sync_status [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Mon Jul 1 12:07:46 2024 -0400

    Bluetooth: Fix usage of __hci_cmd_sync_status
    
    [ Upstream commit 87be7b189b2c50d4b51512f59e4e97db4eedee8a ]
    
    __hci_cmd_sync_status shall only be used if hci_req_sync_lock is _not_
    required which is not the case of hci_dev_cmd so it needs to use
    hci_cmd_sync_status which uses hci_req_sync_lock internally.
    
    Fixes: f1a8f402f13f ("Bluetooth: L2CAP: Fix deadlock")
    Reported-by: Pauli Virtanen <[email protected]>
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Bluetooth: hci_conn: Fix UAF in hci_enhanced_setup_sync [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Wed Oct 2 11:17:26 2024 -0400

    Bluetooth: hci_conn: Fix UAF in hci_enhanced_setup_sync
    
    commit 18fd04ad856df07733f5bb07e7f7168e7443d393 upstream.
    
    This checks if the ACL connection remains valid as it could be destroyed
    while hci_enhanced_setup_sync is pending on cmd_sync leading to the
    following trace:
    
    BUG: KASAN: slab-use-after-free in hci_enhanced_setup_sync+0x91b/0xa60
    Read of size 1 at addr ffff888002328ffd by task kworker/u5:2/37
    
    CPU: 0 UID: 0 PID: 37 Comm: kworker/u5:2 Not tainted 6.11.0-rc6-01300-g810be445d8d6 #7099
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
    Workqueue: hci0 hci_cmd_sync_work
    Call Trace:
     <TASK>
     dump_stack_lvl+0x5d/0x80
     ? hci_enhanced_setup_sync+0x91b/0xa60
     print_report+0x152/0x4c0
     ? hci_enhanced_setup_sync+0x91b/0xa60
     ? __virt_addr_valid+0x1fa/0x420
     ? hci_enhanced_setup_sync+0x91b/0xa60
     kasan_report+0xda/0x1b0
     ? hci_enhanced_setup_sync+0x91b/0xa60
     hci_enhanced_setup_sync+0x91b/0xa60
     ? __pfx_hci_enhanced_setup_sync+0x10/0x10
     ? __pfx___mutex_lock+0x10/0x10
     hci_cmd_sync_work+0x1c2/0x330
     process_one_work+0x7d9/0x1360
     ? __pfx_lock_acquire+0x10/0x10
     ? __pfx_process_one_work+0x10/0x10
     ? assign_work+0x167/0x240
     worker_thread+0x5b7/0xf60
     ? __kthread_parkme+0xac/0x1c0
     ? __pfx_worker_thread+0x10/0x10
     ? __pfx_worker_thread+0x10/0x10
     kthread+0x293/0x360
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0x2f/0x70
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    
    Allocated by task 34:
     kasan_save_stack+0x30/0x50
     kasan_save_track+0x14/0x30
     __kasan_kmalloc+0x8f/0xa0
     __hci_conn_add+0x187/0x17d0
     hci_connect_sco+0x2e1/0xb90
     sco_sock_connect+0x2a2/0xb80
     __sys_connect+0x227/0x2a0
     __x64_sys_connect+0x6d/0xb0
     do_syscall_64+0x71/0x140
     entry_SYSCALL_64_after_hwframe+0x76/0x7e
    
    Freed by task 37:
     kasan_save_stack+0x30/0x50
     kasan_save_track+0x14/0x30
     kasan_save_free_info+0x3b/0x60
     __kasan_slab_free+0x101/0x160
     kfree+0xd0/0x250
     device_release+0x9a/0x210
     kobject_put+0x151/0x280
     hci_conn_del+0x448/0xbf0
     hci_abort_conn_sync+0x46f/0x980
     hci_cmd_sync_work+0x1c2/0x330
     process_one_work+0x7d9/0x1360
     worker_thread+0x5b7/0xf60
     kthread+0x293/0x360
     ret_from_fork+0x2f/0x70
     ret_from_fork_asm+0x1a/0x30
    
    Cc: [email protected]
    Fixes: e07a06b4eb41 ("Bluetooth: Convert SCO configure_datapath to hci_sync")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change [+ + +]

Author: Luiz Augusto von Dentz <[email protected]>
Date:   Mon Sep 30 13:26:21 2024 -0400

    Bluetooth: RFCOMM: FIX possible deadlock in rfcomm_sk_state_change
    
    [ Upstream commit 08d1914293dae38350b8088980e59fbc699a72fe ]
    
    rfcomm_sk_state_change attempts to use sock_lock so it must never be
    called with it locked but rfcomm_sock_ioctl always attempts to lock it
    causing the following trace:
    
    ======================================================
    WARNING: possible circular locking dependency detected
    6.8.0-syzkaller-08951-gfe46a7dd189e #0 Not tainted
    ------------------------------------------------------
    syz-executor386/5093 is trying to acquire lock:
    ffff88807c396258 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1671 [inline]
    ffff88807c396258 (sk_lock-AF_BLUETOOTH-BTPROTO_RFCOMM){+.+.}-{0:0}, at: rfcomm_sk_state_change+0x5b/0x310 net/bluetooth/rfcomm/sock.c:73
    
    but task is already holding lock:
    ffff88807badfd28 (&d->lock){+.+.}-{3:3}, at: __rfcomm_dlc_close+0x226/0x6a0 net/bluetooth/rfcomm/core.c:491
    
    Reported-by: [email protected]
    Tested-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=d7ce59b06b3eb14fd218
    Fixes: 3241ad820dbb ("[Bluetooth] Add timestamp support to L2CAP, RFCOMM and SCO")
    Signed-off-by: Luiz Augusto von Dentz <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bootconfig: Fix the kerneldoc of _xbc_exit() [+ + +]

Author: Masami Hiramatsu (Google) <[email protected]>
Date:   Tue Apr 16 06:44:04 2024 +0900

    bootconfig: Fix the kerneldoc of _xbc_exit()
    
    [ Upstream commit 298b871cd55a607037ac8af0011b9fdeb54c1e65 ]
    
    Fix the kerneldoc of _xbc_exit() which is updated to have an @early
    argument and the function name is changed.
    
    Link: https://lore.kernel.org/all/171321744474.599864.13532445969528690358.stgit@devnote2/
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: 89f9a1e876b5 ("bootconfig: use memblock_free_late to free xbc memory to buddy")
    Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf, x64: Fix a jit convergence issue [+ + +]

Author: Yonghong Song <[email protected]>
Date:   Wed Sep 4 15:12:51 2024 -0700

    bpf, x64: Fix a jit convergence issue
    
    [ Upstream commit c8831bdbfbab672c006a18006d36932a494b2fd6 ]
    
    Daniel Hodges reported a jit error when playing with a sched-ext program.
    The error message is:
      unexpected jmp_cond padding: -4 bytes
    
    But further investigation shows the error is actually due to failed
    convergence. The following is some analysis:
    
      ...
      pass4, final_proglen=4391:
        ...
        20e:    48 85 ff                test   rdi,rdi
        211:    74 7d                   je     0x290
        213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        289:    48 85 ff                test   rdi,rdi
        28c:    74 17                   je     0x2a5
        28e:    e9 7f ff ff ff          jmp    0x212
        293:    bf 03 00 00 00          mov    edi,0x3
    
    Note that insn at 0x211 is 2-byte cond jump insn for offset 0x7d (125)
    and insn at 0x28e is 5-byte jmp insn with offset -129.
    
      pass5, final_proglen=4392:
        ...
        20e:    48 85 ff                test   rdi,rdi
        211:    0f 84 80 00 00 00       je     0x297
        217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        28d:    48 85 ff                test   rdi,rdi
        290:    74 1a                   je     0x2ac
        292:    eb 84                   jmp    0x218
        294:    bf 03 00 00 00          mov    edi,0x3
    
    Note that insn at 0x211 is 6-byte cond jump insn now since its offset
    becomes 0x80 based on previous round (0x293 - 0x213 = 0x80). At the same
    time, insn at 0x292 is a 2-byte insn since its offset is -124.
    
    pass6 will repeat the same code as in pass4. pass7 will repeat the same
    code as in pass5, and so on. This will prevent eventual convergence.
    
    Passes 1-14 are with padding = 0. At pass15, padding is 1 and related
    insn looks like:
    
        211:    0f 84 80 00 00 00       je     0x297
        217:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        24d:    48 85 d2                test   rdx,rdx
    
    The similar code in pass14:
        211:    74 7d                   je     0x290
        213:    48 8b 77 00             mov    rsi,QWORD PTR [rdi+0x0]
        ...
        249:    48 85 d2                test   rdx,rdx
        24c:    74 21                   je     0x26f
        24e:    48 01 f7                add    rdi,rsi
        ...
    
    Before generating the following insn,
      250:    74 21                   je     0x273
    "padding = 1" enables some checking to ensure nops is either 0 or 4
    where
      #define INSN_SZ_DIFF (((addrs[i] - addrs[i - 1]) - (prog - temp)))
      nops = INSN_SZ_DIFF - 2
    
    In this specific case,
      addrs[i] = 0x24e // from pass14
      addrs[i-1] = 0x24d // from pass15
      prog - temp = 3 // from 'test rdx,rdx' in pass15
    so
      nops = -4
    and this triggers the failure.
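
The arithmetic above can be checked with a few lines of C. The function name and parameter names are made up for this sketch; only the formula comes from the quoted macro.

```c
/* Mirrors the quoted arithmetic with plain integers:
 *   INSN_SZ_DIFF = (addrs[i] - addrs[i - 1]) - (prog - temp)
 *   nops         = INSN_SZ_DIFF - 2
 */
int compute_nops(int addr_cur, int addr_prev, int emitted_len)
{
	int insn_sz_diff = (addr_cur - addr_prev) - emitted_len;

	return insn_sz_diff - 2;
}
```

Calling compute_nops(0x24e, 0x24d, 3) gives (1 - 3) - 2 = -4, which is neither 0 nor 4.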
    
    To fix the issue, we need to break cycles of je <-> jmp. For example,
    in the above case, we have
      211:    74 7d                   je     0x290
    the offset is 0x7d. If a 2-byte je insn is generated only if
    the offset is less than 0x7d (<= 0x7c), the cycle can be
    broken and we can achieve convergence.
    
    I did some study on other cases like je <-> je, jmp <-> je and
    jmp <-> jmp which may cause cycles. Those cases are not from actual
    reproducible cases since it is pretty hard to construct a test case
    for them. The results show that the offset <= 0x7b (0x7b = 123) should
    be enough to cover all cases. This patch added a new helper to generate 8-bit
    cond/uncond jmp insns only if the offset range is [-128, 123].
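
A sketch of the range check described above. The function name is chosen for this example. A signed 8-bit immediate could encode offsets up to 127, so the upper bound of 123 is the deliberate restriction that breaks the cycle.

```c
#include <stdbool.h>

/* Use the short (8-bit) jump encoding only when the offset lies in
 * [-128, 123]; anything else gets the longer 32-bit form. */
bool fits_short_jump(int offset)
{
	return offset >= -128 && offset <= 123;
}
```

Under this rule the je with offset 0x7d (125) from pass4 is always emitted in its long form, so it can no longer flip between encodings from one pass to the next.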
    
    Reported-by: Daniel Hodges <[email protected]>
    Signed-off-by: Yonghong Song <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Check percpu map value size first [+ + +]

Author: Tao Chen <[email protected]>
Date:   Tue Sep 10 22:41:10 2024 +0800

    bpf: Check percpu map value size first
    
    [ Upstream commit 1d244784be6b01162b732a5a7d637dfc024c3203 ]
    
    Percpu maps are often used, but the map value size limit is often
    ignored, as in this issue: https://github.com/iovisor/bcc/issues/2519.
    Actually, percpu map value size is bound by PCPU_MIN_UNIT_SIZE, so we
    can first check whether the value size exceeds PCPU_MIN_UNIT_SIZE,
    as is done for the percpu map of local_storage. The resulting error
    message should be clearer than "cannot allocate memory".
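
A minimal sketch of such an up-front check. The 32 KiB limit and the choice of -E2BIG as the error code are assumptions made for illustration; the real limit is whatever PCPU_MIN_UNIT_SIZE evaluates to in the kernel being built.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed limit for illustration only (32 KiB). */
#define EXAMPLE_PCPU_MIN_UNIT_SIZE (32u * 1024u)

/* Reject oversized per-CPU map values before any allocation is attempted,
 * so the caller gets a specific error instead of a generic -ENOMEM later. */
int check_percpu_value_size(size_t value_size)
{
	if (value_size > EXAMPLE_PCPU_MIN_UNIT_SIZE)
		return -E2BIG;
	return 0;
}
```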
    
    Signed-off-by: Jinke Han <[email protected]>
    Signed-off-by: Tao Chen <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Acked-by: Jiri Olsa <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/bpf/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

bpf: Prevent tail call between progs attached to different hooks [+ + +]

Author: Xu Kuohai <[email protected]>
Date:   Fri Jul 19 19:00:53 2024 +0800

    bpf: Prevent tail call between progs attached to different hooks
    
    [ Upstream commit 28ead3eaabc16ecc907cfb71876da028080f6356 ]
    
    bpf progs can be attached to kernel functions, and the attached functions
    can take different parameters or return different return values. If
    prog attached to one kernel function tail calls prog attached to another
    kernel function, the ctx access or return value verification could be
    bypassed.
    
    For example, suppose prog1 is attached to func1, which takes only one
    parameter, and prog2 is attached to func2, which takes two parameters.
    Since the verifier assumes the bpf ctx passed to prog2 is constructed
    based on func2's prototype, it allows prog2 to access the second
    parameter from the bpf ctx passed to it. The problem is that the
    verifier does not prevent prog1 from passing its bpf ctx to prog2 via
    tail call. In this case, the bpf ctx passed to prog2 is constructed
    from func1 instead of func2, that is, the assumption for ctx access
    verification is bypassed.
    
    As another example, suppose BPF LSM prog1 is attached to hook
    file_alloc_security and BPF LSM prog2 is attached to hook
    bpf_lsm_audit_rule_known. The verifier knows the return value rules for
    these two hooks, e.g. it is legal for bpf_lsm_audit_rule_known to return
    positive number 1, and it is illegal for file_alloc_security to return
    a positive number. So the verifier allows prog2 to return positive
    number 1, but does not allow prog1 to return a positive number. The
    problem is that the verifier does not prevent prog1 from calling prog2
    via tail call. In this case, prog2's return value 1 will be used as the
    return value for prog1's hook file_alloc_security. That is, the return
    value rule is bypassed.
    
    This patch adds restriction for tail call to prevent such bypasses.
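
One way to picture such a restriction is a compatibility check between caller and callee. The struct and function below are hypothetical and heavily simplified; the kernel's actual check is implemented differently, and this sketch only captures the idea that both programs must share the same attach target.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a BPF program. */
struct example_prog {
	const char *attach_func;	/* kernel function or hook it is attached to */
};

/* Allow a tail call only when both programs are attached to the same
 * target, so the callee sees the ctx layout and return-value rules it was
 * verified against. */
bool tail_call_allowed(const struct example_prog *caller,
		       const struct example_prog *callee)
{
	if (!caller->attach_func || !callee->attach_func)
		return false;
	return strcmp(caller->attach_func, callee->attach_func) == 0;
}
```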
    
    Signed-off-by: Xu Kuohai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Andrii Nakryiko <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bridge: Handle error of rtnl_register_module(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Oct 8 11:47:34 2024 -0700

    bridge: Handle error of rtnl_register_module().
    
    [ Upstream commit cba5e43b0b757734b1e79f624d93a71435e31136 ]
    
    Since introduced, br_vlan_rtnl_init() has been ignoring the returned
    value of rtnl_register_module(), which could fail silently.
    
    Handling the error allows users to view a module as an all-or-nothing
    thing in terms of the rtnetlink functionality.  This prevents syzkaller
    from reporting spurious errors from its tests, where OOM often occurs
    and module is automatically loaded.
    
    Let's handle the errors by rtnl_register_many().
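
The all-or-nothing behaviour can be sketched in userspace as "register each handler, and unwind on the first failure". Everything below is hypothetical (types, names, and the demo handlers) and is not the rtnetlink API.

```c
#include <stddef.h>

/* Hypothetical handler: register_fn returns 0 on success or a negative
 * error code; unregister_fn undoes a successful registration. */
struct example_handler {
	int (*register_fn)(void);
	void (*unregister_fn)(void);
};

/* Register every handler, or none: on the first failure, unregister the
 * handlers that already succeeded (in reverse order) and return the error. */
int register_many(const struct example_handler *handlers, size_t count)
{
	for (size_t i = 0; i < count; i++) {
		int err = handlers[i].register_fn();

		if (err) {
			while (i-- > 0)
				handlers[i].unregister_fn();
			return err;
		}
	}
	return 0;
}

/* Demo handlers that only track how many registrations are active. */
int example_registered;

int example_ok_register(void)
{
	example_registered++;
	return 0;
}

int example_failing_register(void)
{
	return -1;
}

void example_unregister(void)
{
	example_registered--;
}
```

If the third of three handlers fails, the first two are unregistered again and the caller sees the error, so the module is never left half-registered.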
    
    Fixes: 8dcea187088b ("net: bridge: vlan: add rtm definitions and dump support")
    Fixes: f26b296585dc ("net: bridge: vlan: add new rtm message support")
    Fixes: adb3ce9bcb0f ("net: bridge: vlan: add del rtm message support")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

btrfs: split remaining space to discard in chunks [+ + +]

Author: Luca Stefani <[email protected]>
Date:   Tue Sep 17 22:33:04 2024 +0200

    btrfs: split remaining space to discard in chunks
    
    commit a99fcb0158978ed332009449b484e5f3ca2d7df4 upstream.
    
    Per Qu Wenruo: if we have a very large, mostly empty disk, e.g. an 8TiB
    device, then although we split the discard according to our super block
    locations, the last super block ends at 256G, so we can submit a huge
    discard for the range [256G, 8T), causing a large delay.
    
    Split the space left to discard based on BTRFS_MAX_DISCARD_CHUNK_SIZE in
    preparation of introduction of cancellation points to trim. The value
    of the chunk size is arbitrary, it can be higher or derived from actual
    device capabilities but we can't easily read that using
    bio_discard_limit().
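
The splitting itself is a simple loop. The sketch below assumes a 1 GiB chunk size purely for illustration (the commit notes the value is arbitrary) and uses a callback as a stand-in for submitting a discard. None of the names come from btrfs.

```c
#include <stdint.h>

/* Assumed chunk size for illustration only (1 GiB). */
#define EXAMPLE_MAX_DISCARD_CHUNK_SIZE (1ULL << 30)

/* Stand-in for issuing one discard request; returns 0 on success. */
typedef int (*discard_fn)(uint64_t start, uint64_t len, void *ctx);

/* Discard [start, start + len) in bounded chunks so that no single request
 * covers the whole range. Stops at the first error. */
int discard_in_chunks(uint64_t start, uint64_t len, discard_fn discard,
		      void *ctx)
{
	while (len > 0) {
		uint64_t chunk = len < EXAMPLE_MAX_DISCARD_CHUNK_SIZE ?
				 len : EXAMPLE_MAX_DISCARD_CHUNK_SIZE;
		int err = discard(start, chunk, ctx);

		if (err)
			return err;
		start += chunk;
		len -= chunk;
	}
	return 0;
}

/* Demo callback that only counts requests and bytes. */
struct discard_stats {
	uint64_t requests;
	uint64_t bytes;
};

int count_discard(uint64_t start, uint64_t len, void *ctx)
{
	struct discard_stats *stats = ctx;

	(void)start;
	stats->requests++;
	stats->bytes += len;
	return 0;
}
```

Each pass through the loop is also a natural place to check for cancellation, which is what the commit prepares for.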
    
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
    Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
    CC: [email protected] # 5.15+
    Signed-off-by: Luca Stefani <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

btrfs: zoned: fix missing RCU locking in error message when loading zone info [+ + +]

Author: Filipe Manana <[email protected]>
Date:   Wed Oct 2 15:02:56 2024 +0100

    btrfs: zoned: fix missing RCU locking in error message when loading zone info
    
    [ Upstream commit fe4cd7ed128fe82ab9fe4f9fc8a73d4467699787 ]
    
    At btrfs_load_zone_info() we have an error path that is dereferencing
    the name of a device which is a RCU string but we are not holding a RCU
    read lock, which is incorrect.
    
    Fix this by using btrfs_err_in_rcu() instead of btrfs_err().
    
    The problem is there since commit 08e11a3db098 ("btrfs: zoned: load zone's
    allocation offset"), back then at btrfs_load_block_group_zone_info() but
    then later on that code was factored out into the helper
    btrfs_load_zone_info() by commit 09a46725cc84 ("btrfs: zoned: factor out
    per-zone logic from btrfs_load_block_group_zone_info").
    
    Fixes: 08e11a3db098 ("btrfs: zoned: load zone's allocation offset")
    Reviewed-by: Johannes Thumshirn <[email protected]>
    Reviewed-by: Qu Wenruo <[email protected]>
    Reviewed-by: Naohiro Aota <[email protected]>
    Signed-off-by: Filipe Manana <[email protected]>
    Reviewed-by: David Sterba <[email protected]>
    Signed-off-by: David Sterba <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bus: mhi: ep: Add support for async DMA read operation [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Aug 21 16:53:24 2023 +0530

    bus: mhi: ep: Add support for async DMA read operation
    
    [ Upstream commit 2547beb00ddb40e55b773970622421d978f71473 ]
    
    As with the async DMA write operation, let's add support for async DMA read
    operation. In the async path, the data will be read from the transfer ring
    continuously and when the controller driver notifies the stack using the
    completion callback (mhi_ep_read_completion), then the client driver will
    be notified with the read data and the completion event will be sent to the
    host for the respective ring element (if requested by the host).
    
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Stable-dep-of: c7d0b2db5bc5 ("bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone")
    Signed-off-by: Sasha Levin <[email protected]>

bus: mhi: ep: Add support for async DMA write operation [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Thu Nov 2 20:33:18 2023 +0530

    bus: mhi: ep: Add support for async DMA write operation
    
    [ Upstream commit ee08acb58fe47fc3bc2c137965985cdb1df40b35 ]
    
    In order to optimize the data transfer, let's use the async DMA operation
    for writing (queuing) data to the host.
    
    In the async path, the completion event for the transfer ring will only be
    sent to the host when the controller driver notifies the MHI stack of the
    actual transfer completion using the callback (mhi_ep_skb_completion)
    supplied in "struct mhi_ep_buf_info".
    
    Also to accommodate the async operation, the transfer ring read offset
    (ring->rd_offset) is cached in the "struct mhi_ep_chan" and updated locally
    to let the stack queue further ring items to the controller driver. But the
    actual read offset of the transfer ring will only be updated in the
    completion callback.
    
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Stable-dep-of: c7d0b2db5bc5 ("bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone")
    Signed-off-by: Sasha Levin <[email protected]>

bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Jun 3 22:13:54 2024 +0530

    bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone
    
    [ Upstream commit c7d0b2db5bc5e8c0fdc67b3c8f463c3dfec92f77 ]
    
    MHI endpoint stack accidentally started allocating memory for objects from
    DMA zone since commit 62210a26cd4f ("bus: mhi: ep: Use slab allocator
    where applicable"). But there is no real need to allocate memory from this
    naturally limited DMA zone. This also causes the MHI endpoint stack to run
    out of memory while doing high bandwidth transfers.
    
    So let's switch over to normal memory.
    
    Cc: <[email protected]> # 6.8
    Fixes: 62210a26cd4f ("bus: mhi: ep: Use slab allocator where applicable")
    Reviewed-by: Mayank Rana <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

bus: mhi: ep: Introduce async read/write callbacks [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Nov 27 15:35:50 2023 +0530

    bus: mhi: ep: Introduce async read/write callbacks
    
    [ Upstream commit 8b786ed8fb089e347af21d13ba5677325fcd4cd8 ]
    
    These callbacks can be implemented by the controller drivers to perform
    async read/write operation that increases the throughput.
    
    For aiding the async operation, a completion callback is also introduced.
    
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Stable-dep-of: c7d0b2db5bc5 ("bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone")
    Signed-off-by: Sasha Levin <[email protected]>

bus: mhi: ep: Rename read_from_host() and write_to_host() APIs [+ + +]

Author: Manivannan Sadhasivam <[email protected]>
Date:   Mon Nov 27 13:57:37 2023 +0530

    bus: mhi: ep: Rename read_from_host() and write_to_host() APIs
    
    [ Upstream commit 927105244f8bc48e6841826a5644c6a961e03b5d ]
    
    In the preparation for adding async API support, let's rename the existing
    APIs to read_sync() and write_sync() to make it explicit that these APIs
    are used for synchronous read/write.
    
    Signed-off-by: Manivannan Sadhasivam <[email protected]>
    Stable-dep-of: c7d0b2db5bc5 ("bus: mhi: ep: Do not allocate memory for MHI objects from DMA zone")
    Signed-off-by: Sasha Levin <[email protected]>

clk: bcm: bcm53573: fix OF node leak in init [+ + +]

Author: Krzysztof Kozlowski <[email protected]>
Date:   Mon Aug 26 08:58:01 2024 +0200

    clk: bcm: bcm53573: fix OF node leak in init
    
    [ Upstream commit f92d67e23b8caa81f6322a2bad1d633b00ca000e ]
    
    Driver code is leaking OF node reference from of_get_parent() in
    bcm53573_ilp_init().  Usage of of_get_parent() is not needed in the
    first place, because the parent node will not be freed while we are
    processing given node (triggered by CLK_OF_DECLARE()).  Thus fix the
    leak by accessing parent directly, instead of of_get_parent().
    
    Signed-off-by: Krzysztof Kozlowski <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Stephen Boyd <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D [+ + +]

Author: Peng Fan <[email protected]>
Date:   Fri Jun 7 21:33:39 2024 +0800

    clk: imx: Remove CLK_SET_PARENT_GATE for DRAM mux for i.MX7D
    
    [ Upstream commit a54c441b46a0745683c2eef5a359d22856d27323 ]
    
    For the i.MX7D DRAM related mux clock, the clock source change should ONLY
    be done in low-level asm code without accessing DRAM, followed by a call
    to the clk API to sync the HW clock status with the clk tree. The real
    clock source switch should never be done via the clk API, so the
    CLK_SET_PARENT_GATE flag should NOT be set; otherwise, DRAM's clock
    parent will be disabled while DRAM is active, and the system will hang.
    
    Signed-off-by: Peng Fan <[email protected]>
    Reviewed-by: Abel Vesa <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Abel Vesa <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

comedi: ni_routing: tools: Check when the file could not be opened [+ + +]

Author: Ruffalo Lavoisier <[email protected]>
Date:   Sat Sep 7 05:30:25 2024 +0900

    comedi: ni_routing: tools: Check when the file could not be opened
    
    [ Upstream commit 5baeb157b341b1d26a5815aeaa4d3bb9e0444fda ]
    
    - After fopen(), check for NULL before using the file pointer.
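
    As an illustration of the pattern described above, here is a minimal
    userspace sketch. The helper name count_lines() is hypothetical and is
    not taken from the ni_routing tools; it only shows the fopen() result
    being checked before the stream is used.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the comedi tools): count the lines in a
 * file, returning -1 if the file cannot be opened.
 */
long count_lines(const char *path)
{
    FILE *fp;
    long lines = 0;
    int c;

    assert(path != NULL);

    fp = fopen(path, "r");
    if (fp == NULL) {
        /* Bail out before fgetc()/fclose() can touch a NULL stream. */
        perror(path);
        return -1;
    }

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
    }

    fclose(fp);
    return lines;
}
```

    Without the NULL check, the first fgetc() on a missing file would
    dereference an invalid stream, which is undefined behaviour.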
    
    Signed-off-by: Ruffalo Lavoisier <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

device-dax: correct pgoff align in dax_set_mapping() [+ + +]

Author: Kun(llfl) <[email protected]>
Date:   Fri Sep 27 15:45:09 2024 +0800

    device-dax: correct pgoff align in dax_set_mapping()
    
    commit 7fcbd9785d4c17ea533c42f20a9083a83f301fa6 upstream.
    
    pgoff should be aligned using ALIGN_DOWN() instead of ALIGN().  Otherwise,
    a vmf->address that is not aligned to fault_size is rounded up to the next
    boundary, which can result in memory failure getting the wrong address.
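
    To make the difference concrete, the following userspace sketch
    re-implements the two rounding operations for power-of-two alignments.
    align_up() and align_down() are stand-in names assumed here for
    illustration, not the kernel macros themselves, and the sample values
    in the note below are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a (a must be a power of two). */
uint64_t align_up(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Round x down to the previous multiple of a (a must be a power of two). */
uint64_t align_down(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return x & ~(a - 1);
}
```

    With a 2M (0x200000) fault size, an address of 0x201000 rounds down to
    0x200000, the start of the huge page that contains it, while rounding up
    yields 0x400000, the start of the next one. The two results only agree
    when the address is already aligned.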
    
    It's a subtle situation that can only be observed in
    page_mapped_in_vma() after the page fault has been handled by
    dev_dax_huge_fault.  Generally, there is little chance of calling
    page_mapped_in_vma() on a dev-dax page unless a specific error is
    injected into the dax device to trigger an MCE (memory-failure).  In that
    case, page_mapped_in_vma() will be called to determine which task is
    accessing the failing address and kill that task in the end.
    
    We used a self-developed dax device (with a 2M-aligned mapping) to
    perform error injection at random addresses.  It turned out that an error
    injected at a non-2M-aligned address caused endless MCEs until panic,
    because page_mapped_in_vma() kept returning the wrong address and the task
    accessing the failing address was never killed properly:
    
    [ 3783.719419] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3784.049006] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3784.049190] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3784.448042] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3784.448186] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3784.792026] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3784.792179] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3785.162502] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3785.162633] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3785.461116] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3785.461247] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3785.764730] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3785.764859] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3786.042128] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3786.042259] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3786.464293] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3786.464423] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3786.818090] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3786.818217] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    [ 3787.085297] mce: Uncorrected hardware memory error in user-access at 200c9742380
    [ 3787.085424] Memory failure: 0x200c9742: recovery action for dax page: Recovered
    
    It took us several weeks to pinpoint this problem, but we eventually
    used bpftrace to trace the page fault and mce address and successfully
    identified the issue.
    
    Joao added:
    
    : Likely we never reproduce in production because we always pin
    : device-dax regions in the region align they provide (Qemu does
    : similarly with prealloc in hugetlb/file backed memory).  I think this
    : bug requires that we touch *unpinned* device-dax regions unaligned to
    : the device-dax selected alignment (page size i.e.  4K/2M/1G)
    
    Link: https://lkml.kernel.org/r/23c02a03e8d666fef11bbe13e85c69c8b4ca0624.1727421694.git.llfl@linux.alibaba.com
    Fixes: b9b5777f09be ("device-dax: use ALIGN() for determining pgoff")
    Signed-off-by: Kun(llfl) <[email protected]>
    Tested-by: JianXiong Zhao <[email protected]>
    Reviewed-by: Joao Martins <[email protected]>
    Cc: Dan Williams <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

driver core: bus: Fix double free in driver API bus_register() [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Sat Jul 27 16:34:01 2024 +0800

    driver core: bus: Fix double free in driver API bus_register()
    
    [ Upstream commit bfa54a793ba77ef696755b66f3ac4ed00c7d1248 ]
    
    For bus_register(), any error that happens after kset_register() causes
    @priv to be freed twice.  Fix this by setting @priv to NULL after the
    first free.
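
    The general shape of such a fix can be sketched in userspace C as
    follows. All names here (struct priv, subsys_unregister(),
    bus_register_sketch(), the fail_stage parameter) are hypothetical
    stand-ins and do not reproduce the real bus_register() code; the sketch
    only shows why clearing the pointer after the first free makes the
    shared cleanup path safe.

```c
#include <assert.h>
#include <stdlib.h>

struct priv { int registered; };
struct bus  { struct priv *p; };

static int nr_frees;    /* counts real (non-NULL) frees, for demonstration */

static void priv_free(struct priv *priv)
{
    if (priv)
        nr_frees++;
    free(priv);         /* free(NULL) is a harmless no-op */
}

/* Stand-in for an unregister step that also releases priv. */
static void subsys_unregister(struct priv *priv)
{
    priv_free(priv);
}

/* fail_stage: 0 = succeed, 1 = fail before registering, 2 = fail after. */
int bus_register_sketch(struct bus *bus, int fail_stage)
{
    struct priv *priv;

    assert(bus != NULL);

    priv = calloc(1, sizeof(*priv));
    if (!priv)
        return -1;
    bus->p = priv;

    if (fail_stage == 1)
        goto out;

    priv->registered = 1;
    if (fail_stage == 2)
        goto err_unregister;

    return 0;

err_unregister:
    subsys_unregister(priv);    /* first free happens here */
    priv = NULL;                /* so the shared cleanup below is a no-op */
out:
    priv_free(priv);
    bus->p = NULL;
    return -1;
}
```

    Without the `priv = NULL` assignment, the late failure path would free
    the same allocation once in the unregister step and again at `out`.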
    
    Signed-off-by: Zijun Hu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute [+ + +]

Author: Zijun Hu <[email protected]>
Date:   Wed Jul 24 21:54:48 2024 +0800

    driver core: bus: Return -EIO instead of 0 when show/store invalid bus attribute
    
    [ Upstream commit c0fd973c108cdc22a384854bc4b3e288a9717bb2 ]
    
    Return -EIO instead of 0 for the following erroneous bus attribute operations:
     - read a bus attribute without show().
     - write a bus attribute without store().
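
    A minimal userspace sketch of the two cases listed above, assuming a
    simplified attribute structure: struct attr_sketch, attr_show() and
    attr_store() are illustrative names, not the driver core's own types or
    functions. The point is only that the default return value is -EIO
    rather than 0 when the callback is missing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

/* Simplified attribute with optional show/store callbacks. */
struct attr_sketch {
    ssize_t (*show)(char *buf);
    ssize_t (*store)(const char *buf, size_t count);
};

ssize_t attr_show(const struct attr_sketch *attr, char *buf)
{
    ssize_t ret = -EIO;     /* default when no show() is provided */

    assert(attr != NULL);
    if (attr->show)
        ret = attr->show(buf);
    return ret;
}

ssize_t attr_store(const struct attr_sketch *attr, const char *buf,
                   size_t count)
{
    ssize_t ret = -EIO;     /* default when no store() is provided */

    assert(attr != NULL);
    if (attr->store)
        ret = attr->store(buf, count);
    return ret;
}
```

    Returning 0 from a read looks like end-of-file and returning 0 from a
    write looks like "nothing written", so a negative error code tells the
    caller that the operation is not supported rather than merely empty.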
    
    Signed-off-by: Zijun Hu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Check null pointer before dereferencing se [+ + +]

Author: Alex Hung <[email protected]>
Date:   Thu Aug 29 17:30:26 2024 -0600

    drm/amd/display: Check null pointer before dereferencing se
    
    [ Upstream commit ff599ef6970ee000fa5bc38d02fa5ff5f3fc7575 ]
    
    [WHAT & HOW]
    se is null checked previously in the same function, indicating
    it might be null; therefore, it must be checked when used again.
    
    This fixes 1 FORWARD_NULL issue reported by Coverity.
    
    Acked-by: Alex Hung <[email protected]>
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Remove a redundant check in authenticated_dp [+ + +]

Author: Wenjing Liu <[email protected]>
Date:   Fri Jun 14 11:01:59 2024 -0400

    drm/amd/display: Remove a redundant check in authenticated_dp
    
    [ Upstream commit 4b22869f76563ce1e10858d2ae3305affa8d4a6a ]
    
    [WHY]
    mod_hdcp_execute_and_set returns (*status == MOD_HDCP_STATUS_SUCCESS).
    When it returns 0, it is guaranteed that status == MOD_HDCP_STATUS_SUCCESS
    will be evaluated as false. Since we already use goto out in that case,
    all 3 if (status == MOD_HDCP_STATUS_SUCCESS) clauses are guaranteed to be
    entered. Therefore we are removing the redundant if statements.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Signed-off-by: Wenjing Liu <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Stable-dep-of: bc2fe69f16c7 ("drm/amd/display: Revert "Check HDCP returned status"")
    Signed-off-by: Sasha Levin <[email protected]>

drm/amd/display: Revert "Check HDCP returned status" [+ + +]

Author: Alex Hung <[email protected]>
Date:   Tue Jun 25 13:06:43 2024 -0600

    drm/amd/display: Revert "Check HDCP returned status"
    
    [ Upstream commit bc2fe69f16c7122b5dabc294aa2d6065d8da2169 ]
    
    This reverts commit 5d93060d430b359e16e7c555c8f151ead1ac614b due to a
    power consumption regression.
    
    Reviewed-by: Rodrigo Siqueira <[email protected]>
    Signed-off-by: Jerry Zuo <[email protected]>
    Signed-off-by: Alex Hung <[email protected]>
    Tested-by: Daniel Wheeler <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/crtc: fix uninitialized variable use even harder [+ + +]

Author: Rob Clark <[email protected]>
Date:   Mon Feb 12 13:55:34 2024 -0800

    drm/crtc: fix uninitialized variable use even harder
    
    [ Upstream commit b6802b61a9d0e99dcfa6fff7c50db7c48a9623d3 ]
    
    DRM_MODESET_LOCK_ALL_BEGIN() has a hidden trap-door (aka retry loop),
    which means we can't rely too much on variable initializers.
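
    The hazard can be reproduced in plain C with an explicit retry label.
    This is only a sketch of the general problem: the function names and
    the failed_attempts parameter are invented for illustration and the
    code does not use the DRM locking macros.

```c
#include <assert.h>

/* Buggy: the initializer runs once, before the retry label. */
int count_items_buggy(int failed_attempts)
{
    int visited = 0;
    int tries = 0;
    int i;

    assert(failed_attempts >= 0);
retry:
    for (i = 0; i < 3; i++)
        visited++;
    if (tries++ < failed_attempts)
        goto retry;         /* visited keeps its stale value */
    return visited;
}

/* Fixed: state is (re)assigned after the label, so every pass starts clean. */
int count_items_fixed(int failed_attempts)
{
    int visited;
    int tries = 0;
    int i;

    assert(failed_attempts >= 0);
retry:
    visited = 0;
    for (i = 0; i < 3; i++)
        visited++;
    if (tries++ < failed_attempts)
        goto retry;
    return visited;
}
```

    With one failed attempt, the buggy variant returns 6 while the fixed
    variant returns 3, because only the latter resets its state on each
    pass through the loop.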
    
    Fixes: 6e455f5dcdd1 ("drm/crtc: fix uninitialized variable use")
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Daniel Vetter <[email protected]>
    Reviewed-by: Abhinav Kumar <[email protected]>
    Reviewed-by: Dmitry Baryshkov <[email protected]>
    Tested-by: Dmitry Baryshkov <[email protected]> # sc7180, sdm845
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Dmitry Baryshkov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

drm/i915/hdcp: fix connector refcounting [+ + +]

Author: Jani Nikula <[email protected]>
Date:   Tue Sep 24 18:30:22 2024 +0300

    drm/i915/hdcp: fix connector refcounting
    
    commit 4cc2718f621a6a57a02581125bb6d914ce74d23b upstream.
    
    We acquire a connector reference before scheduling an HDCP prop work,
    and expect the work function to release the reference.
    
    However, if the work was already queued, it won't be queued multiple
    times, and the reference is not dropped.
    
    Release the reference immediately if the work was already queued.
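
    The pattern can be sketched in userspace C as below. The types and
    helpers (struct connector, struct work, queue_work_sketch(),
    schedule_prop_work()) are hypothetical stand-ins and not the i915 code;
    the one property borrowed from the kernel is that queue_work() returns
    false when the work item was already queued.

```c
#include <assert.h>
#include <stdbool.h>

struct connector { int refcount; };
struct work      { bool pending; };

static void connector_get(struct connector *c)
{
    c->refcount++;
}

static void connector_put(struct connector *c)
{
    assert(c->refcount > 0);
    c->refcount--;
}

/* Mimics queue_work(): returns false if the work was already pending. */
static bool queue_work_sketch(struct work *w)
{
    if (w->pending)
        return false;
    w->pending = true;
    return true;
}

/* Take a reference for the worker; drop it at once if nothing was queued. */
void schedule_prop_work(struct connector *c, struct work *w)
{
    connector_get(c);
    if (!queue_work_sketch(w))
        connector_put(c);
}

/* The worker releases the single reference taken when it was queued. */
void prop_work_fn(struct connector *c, struct work *w)
{
    w->pending = false;
    connector_put(c);
}
```

    Scheduling twice before the worker runs therefore leaves exactly one
    extra reference outstanding, which the worker then drops.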
    
    Fixes: a6597faa2d59 ("drm/i915: Protect workers against disappearing connectors")
    Cc: Sean Paul <[email protected]>
    Cc: Suraj Kandpal <[email protected]>
    Cc: Ville Syrjälä <[email protected]>
    Cc: [email protected] # v5.10+
    Reviewed-by: Suraj Kandpal <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Jani Nikula <[email protected]>
    (cherry picked from commit abc0742c79bdb3b164eacab24aea0916d2ec1cb5)
    Signed-off-by: Joonas Lahtinen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/nouveau: pass cli to nouveau_channel_new() instead of drm+device [+ + +]

Author: Ben Skeggs <[email protected]>
Date:   Fri Jul 26 14:38:22 2024 +1000

    drm/nouveau: pass cli to nouveau_channel_new() instead of drm+device
    
    [ Upstream commit 5cca41ac70e5877383ed925bd017884c37edf09b ]
    
    Both of these are stored in nouveau_cli already, and this also allows the
    removal of some void casts.
    
    Signed-off-by: Ben Skeggs <[email protected]>
    Signed-off-by: Danilo Krummrich <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Stable-dep-of: 04e0481526e3 ("nouveau/dmem: Fix privileged error in copy engine channel")
    Signed-off-by: Sasha Levin <[email protected]>

drm/panel: boe-tv101wum-nl6: Fine tune Himax83102-j02 panel HFP and HBP (again) [+ + +]

Author: Cong Yang <[email protected]>
Date:   Fri Mar 1 14:11:28 2024 +0800

    drm/panel: boe-tv101wum-nl6: Fine tune Himax83102-j02 panel HFP and HBP (again)
    
    [ Upstream commit 9dfc46c87cdc8f5a42a71de247a744a6b8188980 ]
    
    The current measured frame rate is 59.95Hz, which does not meet the
    requirements of the touch-stylus, so the stylus cannot work normally.
    After adjustment, the measured frame rate is 60.001Hz. This panel
    currently appears to be used only by me on the MTK platform, so let's
    change this set of parameters.
    
    [ dianders: Added "(again)" to subject and fixed the "Fixes" line ]
    
    Fixes: cea7008190ad ("drm/panel: boe-tv101wum-nl6: Fine tune Himax83102-j02 panel HFP and HBP")
    Signed-off-by: Cong Yang <[email protected]>
    Signed-off-by: Douglas Anderson <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240301061128.3145982-1-yangcong5@huaqin.corp-partner.google.com
    Signed-off-by: Sasha Levin <[email protected]>

drm/v3d: Stop the active perfmon before being destroyed [+ + +]

Author: Maíra Canal <[email protected]>
Date:   Fri Oct 4 10:02:29 2024 -0300

    drm/v3d: Stop the active perfmon before being destroyed
    
    commit 7d1fd3638ee3a9f9bca4785fffb638ca19120718 upstream.
    
    When running `kmscube` with one or more performance monitors enabled
    via `GALLIUM_HUD`, the following kernel panic can occur:
    
    [   55.008324] Unable to handle kernel paging request at virtual address 00000000052004a4
    [   55.008368] Mem abort info:
    [   55.008377]   ESR = 0x0000000096000005
    [   55.008387]   EC = 0x25: DABT (current EL), IL = 32 bits
    [   55.008402]   SET = 0, FnV = 0
    [   55.008412]   EA = 0, S1PTW = 0
    [   55.008421]   FSC = 0x05: level 1 translation fault
    [   55.008434] Data abort info:
    [   55.008442]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
    [   55.008455]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [   55.008467]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [   55.008481] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001046c6000
    [   55.008497] [00000000052004a4] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
    [   55.008525] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
    [   55.008542] Modules linked in: rfcomm [...] vc4 v3d snd_soc_hdmi_codec drm_display_helper
    gpu_sched drm_shmem_helper cec drm_dma_helper drm_kms_helper i2c_brcmstb
    drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer snd backlight
    [   55.008799] CPU: 2 PID: 166 Comm: v3d_bin Tainted: G         C         6.6.47+rpt-rpi-v8 #1  Debian 1:6.6.47-1+rpt1
    [   55.008824] Hardware name: Raspberry Pi 4 Model B Rev 1.5 (DT)
    [   55.008838] pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [   55.008855] pc : __mutex_lock.constprop.0+0x90/0x608
    [   55.008879] lr : __mutex_lock.constprop.0+0x58/0x608
    [   55.008895] sp : ffffffc080673cf0
    [   55.008904] x29: ffffffc080673cf0 x28: 0000000000000000 x27: ffffff8106188a28
    [   55.008926] x26: ffffff8101e78040 x25: ffffff8101baa6c0 x24: ffffffd9d989f148
    [   55.008947] x23: ffffffda1c2a4008 x22: 0000000000000002 x21: ffffffc080673d38
    [   55.008968] x20: ffffff8101238000 x19: ffffff8104f83188 x18: 0000000000000000
    [   55.008988] x17: 0000000000000000 x16: ffffffda1bd04d18 x15: 00000055bb08bc90
    [   55.009715] x14: 0000000000000000 x13: 0000000000000000 x12: ffffffda1bd4cbb0
    [   55.010433] x11: 00000000fa83b2da x10: 0000000000001a40 x9 : ffffffda1bd04d04
    [   55.011162] x8 : ffffff8102097b80 x7 : 0000000000000000 x6 : 00000000030a5857
    [   55.011880] x5 : 00ffffffffffffff x4 : 0300000005200470 x3 : 0300000005200470
    [   55.012598] x2 : ffffff8101238000 x1 : 0000000000000021 x0 : 0300000005200470
    [   55.013292] Call trace:
    [   55.013959]  __mutex_lock.constprop.0+0x90/0x608
    [   55.014646]  __mutex_lock_slowpath+0x1c/0x30
    [   55.015317]  mutex_lock+0x50/0x68
    [   55.015961]  v3d_perfmon_stop+0x40/0xe0 [v3d]
    [   55.016627]  v3d_bin_job_run+0x10c/0x2d8 [v3d]
    [   55.017282]  drm_sched_main+0x178/0x3f8 [gpu_sched]
    [   55.017921]  kthread+0x11c/0x128
    [   55.018554]  ret_from_fork+0x10/0x20
    [   55.019168] Code: f9400260 f1001c1f 54001ea9 927df000 (b9403401)
    [   55.019776] ---[ end trace 0000000000000000 ]---
    [   55.020411] note: v3d_bin[166] exited with preempt_count 1
    
    This issue arises because, upon closing the file descriptor (which happens
    when we interrupt `kmscube`), the active performance monitor is not
    stopped. Although all perfmons are destroyed in `v3d_perfmon_close_file()`,
    the active performance monitor's pointer (`v3d->active_perfmon`) is still
    retained.
    
    If `kmscube` is run again, the driver will attempt to stop the active
    performance monitor using the stale pointer in `v3d->active_perfmon`.
    However, this pointer is no longer valid because the previous process has
    already terminated, and all performance monitors associated with it have
    been destroyed and freed.
    
    To fix this, when the active performance monitor belongs to a given
    process, explicitly stop it before destroying and freeing it.
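
    A simplified userspace sketch of this idea follows. struct gpu, struct
    perfmon and the three helpers are illustrative stand-ins rather than
    the v3d driver's own definitions; only the ordering (stop first, then
    free) reflects the fix described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct perfmon { bool running; };
struct gpu     { struct perfmon *active_perfmon; };

void perfmon_start(struct gpu *gpu, struct perfmon *pm)
{
    assert(gpu != NULL && pm != NULL);
    pm->running = true;
    gpu->active_perfmon = pm;
}

void perfmon_stop(struct gpu *gpu, struct perfmon *pm)
{
    assert(gpu != NULL);
    if (pm == NULL || gpu->active_perfmon != pm)
        return;
    pm->running = false;
    gpu->active_perfmon = NULL;     /* no stale pointer is left behind */
}

/* Destroy a perfmon, stopping it first if it is the active one. */
void perfmon_destroy(struct gpu *gpu, struct perfmon *pm)
{
    assert(gpu != NULL);
    if (gpu->active_perfmon == pm)
        perfmon_stop(gpu, pm);
    free(pm);
}
```

    If the stop step were skipped, gpu->active_perfmon would still point at
    freed memory and the next job that tried to stop the "active" monitor
    would dereference it.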
    
    Cc: [email protected] # v5.15+
    Closes: https://github.com/raspberrypi/linux/issues/6389
    Fixes: 26a4dc29b74a ("drm/v3d: Expose performance counters to userspace")
    Signed-off-by: Maíra Canal <[email protected]>
    Reviewed-by: Juan A. Suarez <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

drm/vc4: Stop the active perfmon before being destroyed [+ + +]

Author: Maíra Canal <[email protected]>
Date:   Fri Oct 4 09:36:00 2024 -0300

    drm/vc4: Stop the active perfmon before being destroyed
    
    commit 0b2ad4f6f2bec74a5287d96cb2325a5e11706f22 upstream.
    
    Upon closing the file descriptor, the active performance monitor is not
    stopped. Although all perfmons are destroyed in `vc4_perfmon_close_file()`,
    the active performance monitor's pointer (`vc4->active_perfmon`) is still
    retained.
    
    If we open a new file descriptor and submit a few jobs with performance
    monitors, the driver will attempt to stop the active performance monitor
    using the stale pointer in `vc4->active_perfmon`. However, this pointer
    is no longer valid because the previous process has already terminated,
    and all performance monitors associated with it have been destroyed and
    freed.
    
    To fix this, when the active performance monitor belongs to a given
    process, explicitly stop it before destroying and freeing it.
    
    Cc: [email protected] # v4.17+
    Cc: Boris Brezillon <[email protected]>
    Cc: Juan A. Suarez Romero <[email protected]>
    Fixes: 65101d8c9108 ("drm/vc4: Expose performance counters to userspace")
    Signed-off-by: Maíra Canal <[email protected]>
    Reviewed-by: Juan A. Suarez <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

e1000e: change I219 (19) devices to ADP [+ + +]

Author: Vitaly Lifshits <[email protected]>
Date:   Sun Sep 8 09:49:17 2024 +0300

    e1000e: change I219 (19) devices to ADP
    
    [ Upstream commit 9d9e5347b035412daa844f884b94a05bac94f864 ]
    
    Sporadic issues, such as PHY access loss, have been observed on I219 (19)
    devices. It was found that these devices have hardware more closely
    related to ADP than MTP and the issues were caused by taking MTP-specific
    flows.
    
    Change the MAC and board types of these devices from MTP to ADP to
    correctly reflect the LAN hardware, and flows, of these devices.
    
    Fixes: db2d737d63c5 ("e1000e: Separate MTP board type from ADP")
    Signed-off-by: Vitaly Lifshits <[email protected]>
    Tested-by: Mor Bar-Gabay <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

e1000e: fix force smbus during suspend flow [+ + +]

Author: Vitaly Lifshits <[email protected]>
Date:   Tue Jul 9 13:31:22 2024 -0700

    e1000e: fix force smbus during suspend flow
    
    commit 76a0a3f9cc2fbd0e56671706bb74a9a988397898 upstream.
    
    Commit 861e8086029e ("e1000e: move force SMBUS from enable ulp function
    to avoid PHY loss issue") resolved a PHY access loss during suspend on
    Meteor Lake consumer platforms, but it affected corporate systems
    incorrectly.
    
    A better fix, working for both consumer and corporate systems, was
    proposed in commit bfd546a552e1 ("e1000e: move force SMBUS near the end
    of enable_ulp function"). However, it introduced a regression on older
    devices, such as [8086:15B8], [8086:15F9], [8086:15BE].
    
    This patch aims to fix the secondary regression, by limiting the scope of
    the changes to Meteor Lake platforms only.
    
    Fixes: bfd546a552e1 ("e1000e: move force SMBUS near the end of enable_ulp function")
    Reported-by: Todd Brandt <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218940
    Reported-by: Dieter Mummenschanz <[email protected]>
    Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218936
    Signed-off-by: Vitaly Lifshits <[email protected]>
    Tested-by: Mor Bar-Gabay <[email protected]> (A Contingent Worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

e1000e: move force SMBUS near the end of enable_ulp function [+ + +]

Author: Hui Wang <[email protected]>
Date:   Tue May 28 15:06:04 2024 -0700

    e1000e: move force SMBUS near the end of enable_ulp function
    
    [ Upstream commit bfd546a552e140b0a4c8a21527c39d6d21addb28 ]
    
    The commit 861e8086029e ("e1000e: move force SMBUS from enable ulp
    function to avoid PHY loss issue") introduces a regression on
    PCH_MTP_I219_LM18 (PCIID: 0x8086550A). Without the referred commit,
    ethernet works well after suspend and resume, but after applying the
    commit, ethernet no longer works after resume and dmesg shows that the
    NIC link changes to 10Mbps (1000Mbps originally):
    
        [   43.305084] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 10 Mbps Full Duplex, Flow Control: Rx/Tx
    
    Without the commit, the force SMBUS code will not be executed if
    "return 0" or "goto out" is executed in enable_ulp(), and in my
    case, the "goto out" is executed since FWSM_FW_VALID is set. But after
    applying the commit, the force SMBUS code will be run unconditionally.
    
    Here move the force SMBUS code back to enable_ulp() and put it
    immediately ahead of hw->phy.ops.release(hw). This allows the longest
    possible settling time for the interface in this function and doesn't
    change the original code logic.
    
    The issue was found on a Lenovo laptop with the ethernet hw as below:
    00:1f.6 Ethernet controller [0200]: Intel Corporation Device [8086:550a]
    (rev 20).
    
    And this patch is verified (cable plug and unplug, system suspend
    and resume) on Lenovo laptops with ethernet hw: [8086:550a],
    [8086:550b], [8086:15bb], [8086:15be], [8086:1a1f], [8086:1a1c] and
    [8086:0dc7].
    
    Fixes: 861e8086029e ("e1000e: move force SMBUS from enable ulp function to avoid PHY loss issue")
    Signed-off-by: Hui Wang <[email protected]>
    Acked-by: Vitaly Lifshits <[email protected]>
    Tested-by: Naama Meir <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Reviewed-by: Paul Menzel <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Tested-by: Zhang Rui <[email protected]>
    Signed-off-by: Jacob Keller <[email protected]>
    Link: https://lore.kernel.org/r/20240528-net-2024-05-28-intel-net-fixes-v1-1-dc8593d2bbc6@intel.com
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: don't set SB_RDONLY after filesystem errors [+ + +]

Author: Jan Kara <[email protected]>
Date:   Mon Aug 5 22:12:41 2024 +0200

    ext4: don't set SB_RDONLY after filesystem errors
    
    [ Upstream commit d3476f3dad4ad68ae5f6b008ea6591d1520da5d8 ]
    
    When the filesystem is mounted with errors=remount-ro, we were setting
    SB_RDONLY flag to stop all filesystem modifications. We knew this misses
    proper locking (sb->s_umount) and does not go through proper filesystem
    remount procedure but it has been the way this worked since early ext2
    days and it was good enough for catastrophic situation damage
    mitigation. Recently, syzbot has found a way (see link) to trigger
    warnings in filesystem freezing because the code got confused by
    SB_RDONLY changing under its hands. Since these days we set
    EXT4_FLAGS_SHUTDOWN on the superblock which is enough to stop all
    filesystem modifications, modifying SB_RDONLY shouldn't be needed. So
    stop doing that.
    
    Link: https://lore.kernel.org/all/[email protected]
    Reported-by: Christian Brauner <[email protected]>
    Signed-off-by: Jan Kara <[email protected]>
    Reviewed-by: Christian Brauner <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ext4: nested locking for xattr inode [+ + +]

Author: Wojciech Gładysz <[email protected]>
Date:   Thu Aug 1 16:38:27 2024 +0200

    ext4: nested locking for xattr inode
    
    [ Upstream commit d1bc560e9a9c78d0b2314692847fc8661e0aeb99 ]
    
    Add nested locking with I_MUTEX_XATTR subclass to avoid lockdep warning
    while handling xattr inode on file open syscall at ext4_xattr_inode_iget.
    
    Backtrace
    EXT4-fs (loop0): Ignoring removed oldalloc option
    ======================================================
    WARNING: possible circular locking dependency detected
    5.10.0-syzkaller #0 Not tainted
    ------------------------------------------------------
    syz-executor543/2794 is trying to acquire lock:
    ffff8880215e1a48 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}, at: inode_lock include/linux/fs.h:782 [inline]
    ffff8880215e1a48 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}, at: ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425
    
    but task is already holding lock:
    ffff8880215e3278 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x136d/0x19c0 fs/ext4/inode.c:5559
    
    which lock already depends on the new lock.
    
    the existing dependency chain (in reverse order) is:
    
    -> #1 (&ei->i_data_sem/3){++++}-{3:3}:
           lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566
           down_write+0x93/0x180 kernel/locking/rwsem.c:1564
           ext4_update_i_disksize fs/ext4/ext4.h:3267 [inline]
           ext4_xattr_inode_write fs/ext4/xattr.c:1390 [inline]
           ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1538 [inline]
           ext4_xattr_set_entry+0x331a/0x3d80 fs/ext4/xattr.c:1662
           ext4_xattr_ibody_set+0x124/0x390 fs/ext4/xattr.c:2228
           ext4_xattr_set_handle+0xc27/0x14e0 fs/ext4/xattr.c:2385
           ext4_xattr_set+0x219/0x390 fs/ext4/xattr.c:2498
           ext4_xattr_user_set+0xc9/0xf0 fs/ext4/xattr_user.c:40
           __vfs_setxattr+0x404/0x450 fs/xattr.c:177
           __vfs_setxattr_noperm+0x11d/0x4f0 fs/xattr.c:208
           __vfs_setxattr_locked+0x1f9/0x210 fs/xattr.c:266
           vfs_setxattr+0x112/0x2c0 fs/xattr.c:283
           setxattr+0x1db/0x3e0 fs/xattr.c:548
           path_setxattr+0x15a/0x240 fs/xattr.c:567
           __do_sys_setxattr fs/xattr.c:582 [inline]
           __se_sys_setxattr fs/xattr.c:578 [inline]
           __x64_sys_setxattr+0xc5/0xe0 fs/xattr.c:578
           do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62
           entry_SYSCALL_64_after_hwframe+0x61/0xcb
    
    -> #0 (&ea_inode->i_rwsem#7/1){+.+.}-{3:3}:
           check_prev_add kernel/locking/lockdep.c:2988 [inline]
           check_prevs_add kernel/locking/lockdep.c:3113 [inline]
           validate_chain+0x1695/0x58f0 kernel/locking/lockdep.c:3729
           __lock_acquire+0x12fd/0x20d0 kernel/locking/lockdep.c:4955
           lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566
           down_write+0x93/0x180 kernel/locking/rwsem.c:1564
           inode_lock include/linux/fs.h:782 [inline]
           ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425
           ext4_xattr_inode_get+0x138/0x410 fs/ext4/xattr.c:485
           ext4_xattr_move_to_block fs/ext4/xattr.c:2580 [inline]
           ext4_xattr_make_inode_space fs/ext4/xattr.c:2682 [inline]
           ext4_expand_extra_isize_ea+0xe70/0x1bb0 fs/ext4/xattr.c:2774
           __ext4_expand_extra_isize+0x304/0x3f0 fs/ext4/inode.c:5898
           ext4_try_to_expand_extra_isize fs/ext4/inode.c:5941 [inline]
           __ext4_mark_inode_dirty+0x591/0x810 fs/ext4/inode.c:6018
           ext4_setattr+0x1400/0x19c0 fs/ext4/inode.c:5562
           notify_change+0xbb6/0xe60 fs/attr.c:435
           do_truncate+0x1de/0x2c0 fs/open.c:64
           handle_truncate fs/namei.c:2970 [inline]
           do_open fs/namei.c:3311 [inline]
           path_openat+0x29f3/0x3290 fs/namei.c:3425
           do_filp_open+0x20b/0x450 fs/namei.c:3452
           do_sys_openat2+0x124/0x460 fs/open.c:1207
           do_sys_open fs/open.c:1223 [inline]
           __do_sys_open fs/open.c:1231 [inline]
           __se_sys_open fs/open.c:1227 [inline]
           __x64_sys_open+0x221/0x270 fs/open.c:1227
           do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62
           entry_SYSCALL_64_after_hwframe+0x61/0xcb
    
    other info that might help us debug this:
    
     Possible unsafe locking scenario:
    
           CPU0                    CPU1
           ----                    ----
      lock(&ei->i_data_sem/3);
                                   lock(&ea_inode->i_rwsem#7/1);
                                   lock(&ei->i_data_sem/3);
      lock(&ea_inode->i_rwsem#7/1);
    
     *** DEADLOCK ***
    
    5 locks held by syz-executor543/2794:
     #0: ffff888026fbc448 (sb_writers#4){.+.+}-{0:0}, at: mnt_want_write+0x4a/0x2a0 fs/namespace.c:365
     #1: ffff8880215e3488 (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: inode_lock include/linux/fs.h:782 [inline]
     #1: ffff8880215e3488 (&sb->s_type->i_mutex_key#7){++++}-{3:3}, at: do_truncate+0x1cf/0x2c0 fs/open.c:62
     #2: ffff8880215e3310 (&ei->i_mmap_sem){++++}-{3:3}, at: ext4_setattr+0xec4/0x19c0 fs/ext4/inode.c:5519
     #3: ffff8880215e3278 (&ei->i_data_sem/3){++++}-{3:3}, at: ext4_setattr+0x136d/0x19c0 fs/ext4/inode.c:5559
     #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_write_trylock_xattr fs/ext4/xattr.h:162 [inline]
     #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: ext4_try_to_expand_extra_isize fs/ext4/inode.c:5938 [inline]
     #4: ffff8880215e30c8 (&ei->xattr_sem){++++}-{3:3}, at: __ext4_mark_inode_dirty+0x4fb/0x810 fs/ext4/inode.c:6018
    
    stack backtrace:
    CPU: 1 PID: 2794 Comm: syz-executor543 Not tainted 5.10.0-syzkaller #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
    Call Trace:
     __dump_stack lib/dump_stack.c:77 [inline]
     dump_stack+0x177/0x211 lib/dump_stack.c:118
     print_circular_bug+0x146/0x1b0 kernel/locking/lockdep.c:2002
     check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2123
     check_prev_add kernel/locking/lockdep.c:2988 [inline]
     check_prevs_add kernel/locking/lockdep.c:3113 [inline]
     validate_chain+0x1695/0x58f0 kernel/locking/lockdep.c:3729
     __lock_acquire+0x12fd/0x20d0 kernel/locking/lockdep.c:4955
     lock_acquire+0x197/0x480 kernel/locking/lockdep.c:5566
     down_write+0x93/0x180 kernel/locking/rwsem.c:1564
     inode_lock include/linux/fs.h:782 [inline]
     ext4_xattr_inode_iget+0x42a/0x5c0 fs/ext4/xattr.c:425
     ext4_xattr_inode_get+0x138/0x410 fs/ext4/xattr.c:485
     ext4_xattr_move_to_block fs/ext4/xattr.c:2580 [inline]
     ext4_xattr_make_inode_space fs/ext4/xattr.c:2682 [inline]
     ext4_expand_extra_isize_ea+0xe70/0x1bb0 fs/ext4/xattr.c:2774
     __ext4_expand_extra_isize+0x304/0x3f0 fs/ext4/inode.c:5898
     ext4_try_to_expand_extra_isize fs/ext4/inode.c:5941 [inline]
     __ext4_mark_inode_dirty+0x591/0x810 fs/ext4/inode.c:6018
     ext4_setattr+0x1400/0x19c0 fs/ext4/inode.c:5562
     notify_change+0xbb6/0xe60 fs/attr.c:435
     do_truncate+0x1de/0x2c0 fs/open.c:64
     handle_truncate fs/namei.c:2970 [inline]
     do_open fs/namei.c:3311 [inline]
     path_openat+0x29f3/0x3290 fs/namei.c:3425
     do_filp_open+0x20b/0x450 fs/namei.c:3452
     do_sys_openat2+0x124/0x460 fs/open.c:1207
     do_sys_open fs/open.c:1223 [inline]
     __do_sys_open fs/open.c:1231 [inline]
     __se_sys_open fs/open.c:1227 [inline]
     __x64_sys_open+0x221/0x270 fs/open.c:1227
     do_syscall_64+0x6d/0xa0 arch/x86/entry/common.c:62
     entry_SYSCALL_64_after_hwframe+0x61/0xcb
    RIP: 0033:0x7f0cde4ea229
    Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 21 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
    RSP: 002b:00007ffd81d1c978 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
    RAX: ffffffffffffffda RBX: 0030656c69662f30 RCX: 00007f0cde4ea229
    RDX: 0000000000000089 RSI: 00000000000a0a00 RDI: 00000000200001c0
    RBP: 2f30656c69662f2e R08: 0000000000208000 R09: 0000000000208000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffd81d1c9c0
    R13: 00007ffd81d1ca00 R14: 0000000000080000 R15: 0000000000000003
    EXT4-fs error (device loop0): ext4_expand_extra_isize_ea:2730: inode #13: comm syz-executor543: corrupted in-inode xattr
    
    Signed-off-by: Wojciech Gładysz <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fbcon: Fix a NULL pointer dereference issue in fbcon_putcs [+ + +]

Author: Qianqiang Liu <[email protected]>
Date:   Wed Sep 25 13:29:36 2024 +0800

    fbcon: Fix a NULL pointer dereference issue in fbcon_putcs
    
    [ Upstream commit 5b97eebcce1b4f3f07a71f635d6aa3af96c236e7 ]
    
    syzbot has found a NULL pointer dereference bug in fbcon.
    Here is the simplified C reproducer:
    
    struct param {
            uint8_t type;
            struct tiocl_selection ts;
    };
    
    int main()
    {
            struct fb_con2fbmap con2fb;
            struct param param;
    
            int fd = open("/dev/fb1", 0, 0);
    
            con2fb.console = 0x19;
            con2fb.framebuffer = 0;
            ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb);
    
            param.type = 2;
            param.ts.xs = 0; param.ts.ys = 0;
            param.ts.xe = 0; param.ts.ye = 0;
            param.ts.sel_mode = 0;
    
            int fd1 = open("/dev/tty1", O_RDWR, 0);
            ioctl(fd1, TIOCLINUX, &param);
    
            con2fb.console = 1;
            con2fb.framebuffer = 0;
            ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb);
    
            return 0;
    }
    
    After calling ioctl(fd1, TIOCLINUX, &param), the subsequent ioctl(fd, FBIOPUT_CON2FBMAP, &con2fb)
    causes the kernel to follow a different execution path:
    
     set_con2fb_map
      -> con2fb_init_display
       -> fbcon_set_disp
        -> redraw_screen
         -> hide_cursor
          -> clear_selection
           -> highlight
            -> invert_screen
             -> do_update_region
              -> fbcon_putcs
               -> ops->putcs
    
    Since ops->putcs is a NULL pointer, this leads to a kernel panic.
    To prevent this, we need to call set_blitting_type() within set_con2fb_map()
    to properly initialize ops->putcs.
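    
    The failure mode can be illustrated with a small user-space sketch. This
    is not the fbcon code: struct demo_ops, demo_set_blitting_type() and
    demo_putcs() are hypothetical stand-ins for the ops table,
    set_blitting_type() and fbcon_putcs(). It only shows why the callback
    has to be installed before any path that calls through it.
    
    ```c
    #include <stddef.h>
    
    /* Hypothetical stand-in for the per-framebuffer ops table. */
    struct demo_ops {
            int (*putcs)(const char *s, size_t len);
    };
    
    /* Hypothetical drawing callback: pretends every character was drawn. */
    static int demo_bit_putcs(const char *s, size_t len)
    {
            (void)s;
            return (int)len;
    }
    
    /* Stand-in for set_blitting_type(): installs the callbacks. */
    static void demo_set_blitting_type(struct demo_ops *ops)
    {
            ops->putcs = demo_bit_putcs;
    }
    
    /*
     * Stand-in for fbcon_putcs(). The NULL check exists only to make the
     * uninitialized state observable here; without it the call below
     * would jump through a NULL pointer.
     */
    static int demo_putcs(const struct demo_ops *ops, const char *s, size_t len)
    {
            if (!ops->putcs)
                    return -1;
            return ops->putcs(s, len);
    }
    ```
    
    With a zeroed struct demo_ops, demo_putcs() reports the missing callback;
    after demo_set_blitting_type() it calls through normally. The patch
    itself takes the second route and initializes the pointer rather than
    adding a guard.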
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=3d613ae53c031502687a
    Tested-by: [email protected]
    Signed-off-by: Qianqiang Liu <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fbdev: sisfb: Fix strbuf array overflow [+ + +]

Author: Andrey Shumilin <[email protected]>
Date:   Fri Sep 27 22:34:24 2024 +0300

    fbdev: sisfb: Fix strbuf array overflow
    
    [ Upstream commit 9cf14f5a2746c19455ce9cb44341b5527b5e19c3 ]
    
    The values of the variables xres and yres are placed in strbuf.
    These variables are obtained from strbuf1.
    The strbuf1 array contains digit characters
    and a space if the array contains non-digit characters.
    Then, when executing sprintf(strbuf, "%ux%ux8", xres, yres);
    more than 16 bytes will be written to strbuf.
    It is suggested to increase the size of the strbuf array to 24.
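    
    The size of 24 can be checked with a short calculation. The sketch below
    assumes a 32-bit unsigned int and uses a hypothetical helper name,
    demo_mode_str_size(); it asks snprintf() how many characters the format
    would produce and adds one for the terminating NUL.
    
    ```c
    #include <limits.h>
    #include <stdio.h>
    
    /*
     * Hypothetical helper: bytes needed, including the trailing NUL, to
     * format "%ux%ux8" for the given resolution.
     */
    static int demo_mode_str_size(unsigned int xres, unsigned int yres)
    {
            /* With a NULL buffer and size 0, snprintf() only reports the length. */
            return snprintf(NULL, 0, "%ux%ux8", xres, yres) + 1;
    }
    ```
    
    For xres = yres = UINT_MAX (4294967295, ten digits each) the string is
    10 + 1 + 10 + 2 = 23 characters, so 24 bytes are needed with the NUL,
    which exceeds the original 16.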
    
    Found by Linux Verification Center (linuxtesting.org) with SVACE.
    
    Signed-off-by: Andrey Shumilin <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fs/ntfs3: Do not call file_modified if collapse range failed [+ + +]

Author: Konstantin Komarov <[email protected]>
Date:   Fri Jun 28 18:29:46 2024 +0300

    fs/ntfs3: Do not call file_modified if collapse range failed
    
    [ Upstream commit 2db86f7995fe6b62a4d6fee9f3cdeba3c6d27606 ]
    
    Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fs/ntfs3: Fix sparse warning in ni_fiemap [+ + +]

Author: Konstantin Komarov <[email protected]>
Date:   Mon Aug 19 16:23:02 2024 +0300

    fs/ntfs3: Fix sparse warning in ni_fiemap
    
    [ Upstream commit 62fea783f96ce825f0ac9e40ce9530ddc1ea2a29 ]
    
    The interface of fiemap_fill_next_extent_k() was modified
    to eliminate the sparse warning.
    
    Fixes: d57431c6f511 ("fs/ntfs3: Do copy_to_user out of run_lock")
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fs/ntfs3: Refactor enum_rstbl to suppress static checker [+ + +]

Author: Konstantin Komarov <[email protected]>
Date:   Tue Jul 23 16:51:18 2024 +0300

    fs/ntfs3: Refactor enum_rstbl to suppress static checker
    
    [ Upstream commit 56c16d5459d5c050a97a138a00a82b105a8e0a66 ]
    
    Comments and brief description of function enum_rstbl added.
    
    Fixes: b46acd6a6a62 ("fs/ntfs3: Add NTFS journal")
    Reported-by: Dan Carpenter <[email protected]>
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

fs/proc/kcore.c: allow translation of physical memory addresses [+ + +]

Author: Alexander Gordeev <[email protected]>
Date:   Mon Sep 30 14:21:19 2024 +0200

    fs/proc/kcore.c: allow translation of physical memory addresses
    
    commit 3d5854d75e3187147613130561b58f0b06166172 upstream.
    
    When /proc/kcore is read, an attempt to read the first two pages results
    in a HW-specific page swap on s390, and other (so-called prefix) pages
    are accessed instead.  That leads to a wrong read.
    
    Allow architecture-specific translation of memory addresses using
    kc_xlate_dev_mem_ptr() and kc_unxlate_dev_mem_ptr() callbacks similarly
    to /dev/mem xlate_dev_mem_ptr() and unxlate_dev_mem_ptr() callbacks.  That
    way an architecture can deal with specific physical memory ranges.
    
    Re-use the existing /dev/mem callback implementation on s390, which
    handles the described prefix pages swapping correctly.
    
    For other architectures the default callback is basically NOP.  It is
    expected the condition (vaddr == __va(__pa(vaddr))) always holds true for
    KCORE_RAM memory type.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Alexander Gordeev <[email protected]>
    Suggested-by: Heiko Carstens <[email protected]>
    Cc: Vasily Gorbik <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

gfs2: qd_check_sync cleanups [+ + +]

Author: Andreas Gruenbacher <[email protected]>
Date:   Fri Jun 7 02:23:54 2024 +0200

    gfs2: qd_check_sync cleanups
    
    [ Upstream commit 59ebc33201237bf38e5adca3794716100660c5b4 ]
    
    Rename qd_check_sync() to qd_grab_sync() and make it return a bool.
    Turn the sync_gen pointer into a regular u64 and pass in U64_MAX instead
    of a NULL pointer when sync generation checking isn't needed.
    
    Introduce a new qd_ungrab_sync() helper for undoing the effects of
    qd_grab_sync() if the subsequent bh_get() on the qd object fails.
    
    Signed-off-by: Andreas Gruenbacher <[email protected]>
    Stable-dep-of: 4b4b6374dc61 ("gfs2: Revert "ignore negated quota changes"")
    Signed-off-by: Sasha Levin <[email protected]>

gfs2: Revert "ignore negated quota changes" [+ + +]

Author: Andreas Gruenbacher <[email protected]>
Date:   Mon Jun 3 19:04:09 2024 +0200

    gfs2: Revert "ignore negated quota changes"
    
    [ Upstream commit 4b4b6374dc6134849f2bdca81fa2945b6ed6d9fc ]
    
    Commit 4c6a08125f22 ("gfs2: ignore negated quota changes") skips quota
    changes with qd_change == 0 instead of writing them back, which leaves
    behind non-zero qd_change values in the affected slots.  The kernel then
    assumes that those slots are unused, while the qd_change values on disk
    indicate that they are indeed still in use.  The next time the
    filesystem is mounted, those invalid slots are read in from disk, which
    will cause inconsistencies.
    
    Revert that commit to avoid filesystem corruption.
    
    This reverts commit 4c6a08125f2249531ec01783a5f4317d7342add5.
    
    Signed-off-by: Andreas Gruenbacher <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

gfs2: Revert "introduce qd_bh_get_or_undo" [+ + +]

Author: Andreas Gruenbacher <[email protected]>
Date:   Fri Jun 7 02:11:12 2024 +0200

    gfs2: Revert "introduce qd_bh_get_or_undo"
    
    [ Upstream commit 2aedfe847b4d91eabee11a44c27244055cef4eb3 ]
    
    The qd_bh_get_or_undo() helper introduced by that commit doesn't improve
    the code much, so revert it and clean things up in a more useful way in
    the next commit.
    
    This reverts commit 7dbc6ae60dd7089d8ed42892b6a66c138f0aa7a0.
    
    Signed-off-by: Andreas Gruenbacher <[email protected]>
    Stable-dep-of: 4b4b6374dc61 ("gfs2: Revert "ignore negated quota changes"")
    Signed-off-by: Sasha Levin <[email protected]>

gpio: aspeed: Add the flush write to ensure the write complete. [+ + +]

Author: Billy Tsai <[email protected]>
Date:   Tue Oct 8 16:14:44 2024 +0800

    gpio: aspeed: Add the flush write to ensure the write complete.
    
    [ Upstream commit 1bb5a99e1f3fd27accb804aa0443a789161f843c ]
    
    Performing a dummy read ensures that the register write operation is fully
    completed, mitigating any potential bus delays that could otherwise impact
    the frequency of bitbang usage. E.g., if the JTAG application uses GPIO to
    control the JTAG pins (TCK, TMS, TDI, TDO, and TRST), and the application
    sets the TCK clock to 1 MHz, the GPIO's high/low transitions will rely on
    a delay function to ensure the clock frequency does not exceed 1 MHz.
    However, this can lead to rapid toggling of the GPIO because the write
    operation is POSTed and does not wait for a bus acknowledgment.
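    
    The write-then-read-back pattern described above can be sketched in
    user space as follows. demo_write_flush() is a hypothetical name, and a
    plain volatile pointer stands in for the register; in a driver the
    accesses would go through the kernel's MMIO accessors instead.
    
    ```c
    #include <stdint.h>
    
    /*
     * Hypothetical helper: store a value to a memory-mapped register and
     * read the same register back, so the posted write has reached the
     * device before the caller starts its delay.
     */
    static uint32_t demo_write_flush(volatile uint32_t *reg, uint32_t val)
    {
            *reg = val;     /* posted write, may still be in flight */
            return *reg;    /* dummy read, completes only after the write */
    }
    ```
    
    The value returned by the read is not needed; the read is issued only
    for its ordering effect on the bus.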
    
    Fixes: 361b79119a4b ("gpio: Add Aspeed driver")
    Reviewed-by: Andrew Jeffery <[email protected]>
    Signed-off-by: Billy Tsai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

gpio: aspeed: Use devm_clk api to manage clock source [+ + +]

Author: Billy Tsai <[email protected]>
Date:   Tue Oct 8 16:14:45 2024 +0800

    gpio: aspeed: Use devm_clk api to manage clock source
    
    [ Upstream commit a6191a3d18119184237f4ee600039081ad992320 ]
    
    Replace of_clk_get with devm_clk_get_enabled to manage the clock source.
    
    Fixes: 5ae4cb94b313 ("gpio: aspeed: Add debounce support")
    Reviewed-by: Andrew Jeffery <[email protected]>
    Signed-off-by: Billy Tsai <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bartosz Golaszewski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hid-asus: add ROG Ally X prod ID to quirk list [+ + +]

Author: Luke D. Jones <[email protected]>
Date:   Thu Jul 25 10:31:25 2024 +1200

    hid-asus: add ROG Ally X prod ID to quirk list
    
    [ Upstream commit d1aa95e86f178dc597e80228cd9bd81fc3510f34 ]
    
    The new ASUS ROG Ally X functions almost exactly the same as the previous
    model, so we can use the same quirks.
    
    Signed-off-by: Luke D. Jones <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: amd_sfh: Switch to device-managed dmam_alloc_coherent() [+ + +]

Author: Basavaraj Natikar <[email protected]>
Date:   Wed Oct 9 20:17:57 2024 +0530

    HID: amd_sfh: Switch to device-managed dmam_alloc_coherent()
    
    commit c56f9ecb7fb6a3a90079c19eb4c8daf3bbf514b3 upstream.
    
    Using the device-managed version simplifies clean-up in the probe()
    error path.
    
    Additionally, the device-managed version ensures proper cleanup, which helps to
    resolve memory errors, page faults, btrfs going read-only, and btrfs
    disk corruption.
    
    Fixes: 4b2c53d93a4b ("SFH:Transport Driver to add support of AMD Sensor Fusion Hub (SFH)")
    Tested-by: Chris Hixon <[email protected]>
    Tested-by: Richard <[email protected]>
    Tested-by: Skyler <[email protected]>
    Reported-by: Chris Hixon <[email protected]>
    Closes: https://lore.kernel.org/all/[email protected]/
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=219331
    Signed-off-by: Basavaraj Natikar <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: asus: add ROG Ally N-Key ID and keycodes [+ + +]

Author: Luke D. Jones <[email protected]>
Date:   Tue Apr 16 21:04:01 2024 +1200

    HID: asus: add ROG Ally N-Key ID and keycodes
    
    [ Upstream commit 08b50c6b0b0940a304b481346cc187d489c6a751 ]
    
    A handful of buttons on the ROG Ally are not actually part of the xpad
    device and are instead keyboard keys (a typical use of the MCU that asus
    uses). We attach a group of F<num> key codes which aren't used much and
    which the handheld community has already accepted as defaults here.
    
    Signed-off-by: Luke D. Jones <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: asus: add ROG Z13 lightbar [+ + +]

Author: Luke D. Jones <[email protected]>
Date:   Tue Apr 16 21:04:02 2024 +1200

    HID: asus: add ROG Z13 lightbar
    
    [ Upstream commit e901f10adb1f387fff1082297065a0da0191b83d ]
    
    Add init of the lightbar which is a small panel on the back of the ASUS
    ROG Z13 and uses the same MCU as keyboards.
    
    Signed-off-by: Luke D. Jones <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: i2c-hid: Remove I2C_HID_QUIRK_SET_PWR_WAKEUP_DEV quirk [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sat Dec 2 23:46:13 2023 +0100

    HID: i2c-hid: Remove I2C_HID_QUIRK_SET_PWR_WAKEUP_DEV quirk
    
    [ Upstream commit bd008acdac45011f2246ec2518ef19c2da9e6008 ]
    
    Re-trying the power-on command on failure on all devices should
    not be a problem, drop the I2C_HID_QUIRK_SET_PWR_WAKEUP_DEV quirk
    and simply retry power-on on all devices.
    
    Reviewed-by: Douglas Anderson <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Stable-dep-of: 26dd6a5667f5 ("HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend")
    Signed-off-by: Sasha Levin <[email protected]>

HID: i2c-hid: Renumber I2C_HID_QUIRK_ defines [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sat Dec 2 23:46:14 2023 +0100

    HID: i2c-hid: Renumber I2C_HID_QUIRK_ defines
    
    [ Upstream commit 7d7a252842ecafb9b4541dc8470907e97bc6df62 ]
    
    The quirks variable and the I2C_HID_QUIRK_ defines are never used /
    exported outside of the i2c-hid code, so renumber them to start at
    BIT(0) again.
    
    Reviewed-by: Douglas Anderson <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Stable-dep-of: 26dd6a5667f5 ("HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend")
    Signed-off-by: Sasha Levin <[email protected]>

HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend [+ + +]

Author: Kai-Heng Feng <[email protected]>
Date:   Mon Jan 15 12:50:51 2024 +0800

    HID: i2c-hid: Skip SET_POWER SLEEP for Cirque touchpad on system suspend
    
    [ Upstream commit 26dd6a5667f500c5d991f90a9ac5998a71afaf5c ]
    
    There's a Cirque touchpad that wakes the system up without anything
    touching the touchpad. The input report is empty when this happens.
    The reason is stated in HID over I2C spec, 7.2.8.2:
    "If the DEVICE wishes to wake the HOST from its low power state, it can
    issue a wake by asserting the interrupt."
    
    This is fine if the OS can put the system back to suspend by identifying
    that the input wakeup count stays the same on resume, like Chrome OS
    Dark Resume [0]. But regular distros lack such a policy.
    
    Though the impact of the change on power consumption is minimal for the
    touchpad, other i2c-hid devices may depend on SLEEP to control power.
    So use a quirk to limit the scope of the change.
    
    [0] https://chromium.googlesource.com/chromiumos/platform2/+/HEAD/power_manager/docs/dark_resume.md
    
    Signed-off-by: Kai-Heng Feng <[email protected]>
    Reviewed-by: Douglas Anderson <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hid: intel-ish-hid: Fix uninitialized variable 'rv' in ish_fw_xfer_direct_dma [+ + +]

Author: SurajSonawane2415 <[email protected]>
Date:   Fri Oct 4 13:29:44 2024 +0530

    hid: intel-ish-hid: Fix uninitialized variable 'rv' in ish_fw_xfer_direct_dma
    
    commit d41bff05a61fb539f21e9bf0d39fac77f457434e upstream.
    
    Fix the uninitialized symbol 'rv' in the function ish_fw_xfer_direct_dma
    to resolve the following warning from the smatch tool:
    drivers/hid/intel-ish-hid/ishtp-fw-loader.c:714 ish_fw_xfer_direct_dma()
    error: uninitialized symbol 'rv'.
    Initialize 'rv' to 0 to prevent undefined behavior from uninitialized
    access.
    
    Cc: [email protected]
    Fixes: 91b228107da3 ("HID: intel-ish-hid: ISH firmware loader client driver")
    Signed-off-by: SurajSonawane2415 <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Benjamin Tissoires <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: mcp2200: added driver for GPIOs of MCP2200 [+ + +]

Author: Johannes Roith <[email protected]>
Date:   Thu Sep 21 18:49:28 2023 +0200

    HID: mcp2200: added driver for GPIOs of MCP2200
    
    [ Upstream commit 740329d7120f8608ead64b0f3417c02ca1d6b32f ]
    
    Added a gpiochip compatible driver to control the 8 GPIOs of
    the MCP2200 by using the HID interface.
    
    Using GPIOs with alternative functions (GP0<->SSPND, GP1<->USBCFG,
    GP6<->RXLED, GP7<->TXLED) will reset the functions, if set (unset by
    default).
    
    The driver was tested while also using the UART of the chip. Setting
    and reading the GPIOs has no effect on the UART communication. However,
    a reset is triggered after the CONFIGURE command. If the GPIO Direction
    is constantly changed, this will affect the communication at low baud
    rates. This is a hardware problem of the MCP2200 and is not caused by
    the driver.
    
    Signed-off-by: Johannes Roith <[email protected]>
    Reviewed-by: Rahul Rameshbabu <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

HID: multitouch: Add support for lenovo Y9000P Touchpad [+ + +]

Author: He Lugang <[email protected]>
Date:   Tue Aug 27 10:56:05 2024 +0800

    HID: multitouch: Add support for lenovo Y9000P Touchpad
    
    commit 251efae73bd46b097deec4f9986d926813aed744 upstream.
    
    The 2024 Lenovo Y9000P, which uses the GT7868Q chip, also needs a fixup.
    The information of the chip is as follows:
    I2C HID v1.00 Mouse [GXTP5100:00 27C6:01E0]
    
    Signed-off-by: He Lugang <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: WangYuli <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

HID: plantronics: Workaround for an unexcepted opposite volume key [+ + +]

Author: Wade Wang <[email protected]>
Date:   Mon Sep 16 16:56:00 2024 +0800

    HID: plantronics: Workaround for an unexcepted opposite volume key
    
    commit 87b696209007b7c4ef7bdfe39ea0253404a43770 upstream.
    
    Some Plantronics headsets, listed below, send an unexpected opposite
    volume key HID report 200ms after each volume key press, for example an
    unexpected Volume Up key following a Volume Down key pressed by the user.
    This patch adds a quirk to hid-plantronics for these devices, which
    ignores the second, unexpected opposite volume key if it arrives
    within 220ms of the last one that was handled.
        Plantronics EncorePro 500 Series  (047f:431e)
        Plantronics Blackwire_3325 Series (047f:430c)
    
    The patch was tested on the mentioned model. It shouldn't affect
    other models; however, this quirk might be needed for them too.
    Auto-repeat (when a key is held pressed) is not affected per test
    result.
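    
    The filtering rule described above can be sketched as follows. This is
    a user-space illustration, not the hid-plantronics code: the type and
    function names are hypothetical, timestamps are plain milliseconds, and
    treating "within 220ms" as a strict less-than comparison is an
    assumption.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>
    
    #define DEMO_DOUBLE_KEY_TIMEOUT_MS 220  /* window taken from the text above */
    
    enum demo_key { DEMO_KEY_NONE, DEMO_KEY_VOL_UP, DEMO_KEY_VOL_DOWN };
    
    /* Last volume key that was actually handled, and when. */
    struct demo_state {
            enum demo_key last_key;
            uint64_t last_ms;
    };
    
    /* Returns true if the event should be dropped as a spurious opposite key. */
    static bool demo_should_drop(struct demo_state *st, enum demo_key key,
                                 uint64_t now_ms)
    {
            bool opposite =
                    (key == DEMO_KEY_VOL_UP && st->last_key == DEMO_KEY_VOL_DOWN) ||
                    (key == DEMO_KEY_VOL_DOWN && st->last_key == DEMO_KEY_VOL_UP);
    
            if (opposite && now_ms - st->last_ms < DEMO_DOUBLE_KEY_TIMEOUT_MS)
                    return true;    /* dropped; the handled key stays recorded */
    
            st->last_key = key;
            st->last_ms = now_ms;
            return false;
    }
    ```
    
    Because a repeated key in the same direction is never "opposite", it
    always passes, which is consistent with auto-repeat being unaffected.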
    
    Cc: [email protected]
    Signed-off-by: Wade Wang <[email protected]>
    Signed-off-by: Jiri Kosina <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

hwmon: (adm9240) Add missing dependency on REGMAP_I2C [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Wed Oct 2 03:08:08 2024 +0200

    hwmon: (adm9240) Add missing dependency on REGMAP_I2C
    
    [ Upstream commit 14849a2ec175bb8a2280ce20efe002bb19f1e274 ]
    
    This driver requires REGMAP_I2C to be selected in order to get access to
    regmap_config and devm_regmap_init_i2c. Add the missing dependency.
    
    Fixes: df885d912f67 ("hwmon: (adm9240) Convert to regmap")
    Signed-off-by: Javier Carrasco <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hwmon: (adt7470) Add missing dependency on REGMAP_I2C [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Wed Oct 2 03:08:09 2024 +0200

    hwmon: (adt7470) Add missing dependency on REGMAP_I2C
    
    [ Upstream commit b6abcc19566509ab4812bd5ae5df46515d0c1d70 ]
    
    This driver requires REGMAP_I2C to be selected in order to get access to
    regmap_config and devm_regmap_init_i2c. Add the missing dependency.
    
    Fixes: ef67959c4253 ("hwmon: (adt7470) Convert to use regmap")
    Signed-off-by: Javier Carrasco <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hwmon: (mc34vr500) Add missing dependency on REGMAP_I2C [+ + +]

Author: Javier Carrasco <[email protected]>
Date:   Wed Oct 2 02:31:25 2024 +0200

    hwmon: (mc34vr500) Add missing dependency on REGMAP_I2C
    
    [ Upstream commit 56c77c0f4a7c9043e7d1d94e0aace264361e6717 ]
    
    This driver requires REGMAP_I2C to be selected in order to get access to
    regmap_config and devm_regmap_init_i2c. Add the missing dependency.
    
    Fixes: 07830d9ab34c ("hwmon: add initial NXP MC34VR500 PMIC monitoring support")
    Signed-off-by: Javier Carrasco <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hwmon: (tmp513) Add missing dependency on REGMAP_I2C [+ + +]

Author: Guenter Roeck <[email protected]>
Date:   Tue Oct 1 11:37:15 2024 -0700

    hwmon: (tmp513) Add missing dependency on REGMAP_I2C
    
    [ Upstream commit 193bc02c664999581a1f38c152f379fce91afc0c ]
    
    0-day reports:
    
    drivers/hwmon/tmp513.c:162:21: error:
            variable 'tmp51x_regmap_config' has initializer but incomplete type
    162 | static const struct regmap_config tmp51x_regmap_config = {
        |                     ^
    
    struct regmap_config is only available if REGMAP is enabled.
    Add the missing Kconfig dependency to fix the problem.
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
    Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.")
    Cc: Eric Tremblay <[email protected]>
    Reviewed-by: Javier Carrasco <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

hwmon: intel-m10-bmc-hwmon: relabel Columbiaville to CVL Die Temperature [+ + +]

Author: Peter Colberg <[email protected]>
Date:   Thu Sep 19 13:34:17 2024 -0400

    hwmon: intel-m10-bmc-hwmon: relabel Columbiaville to CVL Die Temperature
    
    [ Upstream commit a017616fafc6b2a6b3043bf46f6381ef2611c188 ]
    
    Consistently use CVL instead of Columbiaville, since CVL is already
    being used in all other sensor labels for the Intel N6000 card.
    
    Fixes: e1983220ae14 ("hwmon: intel-m10-bmc-hwmon: Add N6000 sensors")
    Signed-off-by: Peter Colberg <[email protected]>
    Reviewed-by: Michael Adler <[email protected]>
    Message-ID: <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i2c: i801: Use a different adapter-name for IDF adapters [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Mon Aug 12 22:39:48 2024 +0200

    i2c: i801: Use a different adapter-name for IDF adapters
    
    [ Upstream commit 43457ada98c824f310adb7bd96bd5f2fcd9a3279 ]
    
    On chipsets with a second 'Integrated Device Function' SMBus controller use
    a different adapter-name for the second IDF adapter.
    
    This allows platform glue code which is looking for the primary i801
    adapter to manually instantiate i2c_clients on to differentiate
    between the 2.
    
    This allows such code to find the primary i801 adapter by name, without
    needing to duplicate the PCI-ids to feature-flags mapping from i2c-i801.c.
    
    Reviewed-by: Pali Rohár <[email protected]>
    Signed-off-by: Hans de Goede <[email protected]>
    Acked-by: Wolfram Sang <[email protected]>
    Signed-off-by: Andi Shyti <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i3c: master: cdns: Fix use after free vulnerability in cdns_i3c_master Driver Due to Race Condition [+ + +]

Author: Kaixin Wang <[email protected]>
Date:   Wed Sep 11 23:35:44 2024 +0800

    i3c: master: cdns: Fix use after free vulnerability in cdns_i3c_master Driver Due to Race Condition
    
    [ Upstream commit 609366e7a06d035990df78f1562291c3bf0d4a12 ]
    
    In the cdns_i3c_master_probe function, &master->hj_work is bound with
    cdns_i3c_master_hj. And cdns_i3c_master_interrupt can call
    cdns_i3c_master_demux_ibis function to start the work.
    
    If we remove the module, cdns_i3c_master_remove is called to clean up
    and frees master->base through i3c_master_unregister while the work
    mentioned above may still be using it. The sequence of operations
    that may lead to a UAF bug is as follows:
    
    CPU0                                      CPU1
    
                                         | cdns_i3c_master_hj
    cdns_i3c_master_remove               |
    i3c_master_unregister(&master->base) |
    device_unregister(&master->dev)      |
    device_release                       |
    //free master->base                  |
                                         | i3c_master_do_daa(&master->base)
                                         | //use master->base
    
    Fix it by ensuring that the work is canceled before proceeding with
    the cleanup in cdns_i3c_master_remove.
    
    Signed-off-by: Kaixin Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexandre Belloni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i40e: Fix macvlan leak by synchronizing access to mac_filter_hash [+ + +]

Author: Aleksandr Loktionov <[email protected]>
Date:   Mon Sep 23 11:12:19 2024 +0200

    i40e: Fix macvlan leak by synchronizing access to mac_filter_hash
    
    [ Upstream commit dac6c7b3d33756d6ce09f00a96ea2ecd79fae9fb ]
    
    This patch addresses a macvlan leak issue in the i40e driver caused by
    concurrent access to vsi->mac_filter_hash. The leak occurs when multiple
    threads attempt to modify the mac_filter_hash simultaneously, leading to
    inconsistent state and potential memory leaks.
    
    To fix this, we now wrap the calls to i40e_del_mac_filter() and zeroing
    vf->default_lan_addr.addr with spin_lock/unlock_bh(&vsi->mac_filter_hash_lock),
    ensuring atomic operations and preventing concurrent access.
    
    Additionally, we add lockdep_assert_held(&vsi->mac_filter_hash_lock) in
    i40e_add_mac_filter() to help catch similar issues in the future.
    
    Reproduction steps:
    1. Spawn VFs and configure port vlan on them.
    2. Trigger concurrent macvlan operations (e.g., adding and deleting
            portvlan and/or mac filters).
    3. Observe the potential memory leak and inconsistent state in the
            mac_filter_hash.
    
    This synchronization ensures the integrity of the mac_filter_hash and prevents
    the described leak.
    
    Fixes: fed0d9f13266 ("i40e: Fix VF's MAC Address change on VM")
    Reviewed-by: Arkadiusz Kubalewski <[email protected]>
    Signed-off-by: Aleksandr Loktionov <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Rafal Romanowski <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i40e: Fix ST code value for Clause 45 [+ + +]

Author: Ivan Vecera <[email protected]>
Date:   Wed Nov 29 17:17:10 2023 +0100

    i40e: Fix ST code value for Clause 45
    
    [ Upstream commit 9b3daf2b0443eeba23c3888059342aec920dfd53 ]
    
    ST code value for clause 45 that has been changed by
    commit 8196b5fd6c73 ("i40e: Refactor I40E_MDIO_CLAUSE* macros")
    is currently wrong.
    
    The mentioned commit refactored ..MDIO_CLAUSE??_STCODE_MASK so
    their value is the same for both clauses. The value is correct
    for clause 22 but not for clause 45.
    
    Fix the issue by adding a parameter to I40E_GLGEN_MSCA_STCODE_MASK
    macro that specifies required value.
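
    A parameterized field macro of this kind can be sketched as follows. This
    is a Python illustration, not the driver's C code: the helper name
    stcode_mask and the bit offset of 28 are assumptions made for the example.
    The ST values (01 for Clause 22, 00 for Clause 45) follow the IEEE 802.3
    MDIO frame formats.

```python
# Hypothetical model of a register-field macro that takes the field value
# as a parameter. STCODE_SHIFT is an assumed offset, not taken from the commit.
STCODE_SHIFT = 28
STCODE_WIDTH = 2


def stcode_mask(value: int) -> int:
    """Place a 2-bit ST (start-of-frame) code into its register field."""
    if not 0 <= value < (1 << STCODE_WIDTH):
        raise ValueError("ST code must fit in two bits")
    return value << STCODE_SHIFT


# Each clause now gets its own value instead of sharing a single constant.
CLAUSE22_STCODE = stcode_mask(1)
CLAUSE45_STCODE = stcode_mask(0)
```

    With a single shared constant both clauses would end up with the same
    field contents, which is the situation the commit describes as wrong for
    clause 45.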
    
    Fixes: 8196b5fd6c73 ("i40e: Refactor I40E_MDIO_CLAUSE* macros")
    Signed-off-by: Ivan Vecera <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

i40e: Include types.h to some headers [+ + +]

Author: Tony Nguyen <[email protected]>
Date:   Wed Jan 17 09:25:32 2024 -0800

    i40e: Include types.h to some headers
    
    [ Upstream commit 9cfd3b502153810b66ac0ce47f1fba682228f2d2 ]
    
    Commit 56df345917c0 ("i40e: Remove circular header dependencies and fix
    headers") redistributed a number of includes from one large header file
    to the locations they were needed. In some environments, types.h is not
    included, which causes compile issues. The driver should not rely on
    implicit inclusion from other locations; explicitly include it in these
    files.
    
    Snippet of issue. Entire log can be seen through the Closes: link.
    
    In file included from drivers/net/ethernet/intel/i40e/i40e_diag.h:7,
                     from drivers/net/ethernet/intel/i40e/i40e_diag.c:4:
    drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h:33:9: error: unknown type name '__le16'
       33 |         __le16 flags;
          |         ^~~~~~
    drivers/net/ethernet/intel/i40e/i40e_adminq_cmd.h:34:9: error: unknown type name '__le16'
       34 |         __le16 opcode;
          |         ^~~~~~
    ...
    drivers/net/ethernet/intel/i40e/i40e_diag.h:22:9: error: unknown type name 'u32'
       22 |         u32 elements;   /* number of elements if array */
          |         ^~~
    drivers/net/ethernet/intel/i40e/i40e_diag.h:23:9: error: unknown type name 'u32'
       23 |         u32 stride;     /* bytes between each element */
    
    Reported-by: Martin Zaharinov <[email protected]>
    Closes: https://lore.kernel.org/netdev/[email protected]/
    Fixes: 56df345917c0 ("i40e: Remove circular header dependencies and fix headers")
    Reviewed-by: Jesse Brandeburg <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ice: Fix netif_is_ice() in Safe Mode [+ + +]

Author: Marcin Szycik <[email protected]>
Date:   Tue Sep 24 12:04:24 2024 +0200

    ice: Fix netif_is_ice() in Safe Mode
    
    [ Upstream commit 8e60dbcbaaa177dacef55a61501790e201bf8c88 ]
    
    netif_is_ice() works by checking the pointer to netdev ops. However, it
    only checks for the default ice_netdev_ops, not ice_netdev_safe_mode_ops,
    so in Safe Mode it always returns false, which is unintuitive. While it
    doesn't look like netif_is_ice() is currently being called anywhere in Safe
    Mode, this could change and potentially lead to unexpected behaviour.
    
    Fixes: df006dd4b1dc ("ice: Add initial support framework for LAG")
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Marcin Szycik <[email protected]>
    Reviewed-by: Brett Creeley <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ice: fix VLAN replay after reset [+ + +]

Author: Dave Ertman <[email protected]>
Date:   Wed Sep 18 14:02:56 2024 -0400

    ice: fix VLAN replay after reset
    
    [ Upstream commit 0eae2c136cb624e4050092feb59f18159b4f2512 ]
    
    There is currently a bug when more than one VLAN is defined and any
    reset that affects the PF is initiated: after the reset rebuild,
    no traffic will pass on any VLAN but the last one created.
    
    This is caused by the iteration through the VLANs during replay, each
    clearing the vsi_map bitmap of the VSI that is being replayed.  The
    problem is that during the replay, the pointer to the vsi_map bitmap
    is used by each successive VLAN to determine if it should be replayed
    on this VSI.
    
    The logic was that the replay of the VLAN would replace the bit in the map
    before the next VLAN would iterate through.  But, since the replay copies
    the old bitmap pointer to filt_replay_rules and creates a new one for the
    recreated VLANS, it does not do this, and leaves the old bitmap broken
    to be used to replay the remaining VLANs.
    
    Since the old bitmap will be cleaned up in post replay cleanup, there is
    no need to alter it and break following VLAN replay, so don't clear the
    bit.
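
    The failure mode can be illustrated with a toy model. The sketch below is
    not the driver's code: replay_vlans, the set standing in for the vsi_map
    bitmap, and the VLAN and VSI numbers are all hypothetical, and the real
    replay path is considerably more involved.

```python
def replay_vlans(vlans, vsi_map, vsi, clear_bit):
    """Return the VLANs that get replayed on `vsi`.

    `vsi_map` is a shared set standing in for the old bitmap that every
    VLAN rule consults during replay.
    """
    replayed = []
    for vlan in vlans:
        if vsi not in vsi_map:
            continue  # the rule looks as if it never applied to this VSI
        if clear_bit:
            vsi_map.discard(vsi)  # old behaviour: breaks the check for later VLANs
        replayed.append(vlan)
    return replayed


buggy = replay_vlans([30, 20, 10], {7}, 7, clear_bit=True)
fixed = replay_vlans([30, 20, 10], {7}, 7, clear_bit=False)
```

    In this model, clearing the bit lets only the first VLAN visited be
    replayed, while leaving the old map untouched replays all three.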
    
    Fixes: 334cb0626de1 ("ice: Implement VSI replay framework")
    Reviewed-by: Przemek Kitszel <[email protected]>
    Signed-off-by: Dave Ertman <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ice: Flush FDB entries before reset [+ + +]

Author: Wojciech Drewek <[email protected]>
Date:   Fri Sep 27 14:38:01 2024 +0200

    ice: Flush FDB entries before reset
    
    [ Upstream commit fbcb968a98ac0b71f5a2bda2751d7a32d201f90d ]
    
    Triggering the reset while in switchdev mode causes
    errors[1]. Rules are already removed by this time
    because switch content is flushed in case of the reset.
    This means that rules were deleted from HW but SW
    still thinks they exist, so when we get a
    SWITCHDEV_FDB_DEL_TO_DEVICE notification we try to
    delete a non-existent rule.
    
    We can avoid these errors by clearing the rules
    early in the reset flow before they are removed from HW.
    Switchdev API will get notified that the rule was removed
    so we won't get SWITCHDEV_FDB_DEL_TO_DEVICE notification.
    Remove unnecessary ice_clear_sw_switch_recipes.
    
    [1]
    ice 0000:01:00.0: Failed to delete FDB forward rule, err: -2
    ice 0000:01:00.0: Failed to delete FDB guard rule, err: -2
    
    Fixes: 7c945a1a8e5f ("ice: Switchdev FDB events support")
    Reviewed-by: Mateusz Polchlopek <[email protected]>
    Signed-off-by: Wojciech Drewek <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ice: rename switchdev to eswitch [+ + +]

Author: Michal Swiatkowski <[email protected]>
Date:   Tue Oct 24 13:09:15 2023 +0200

    ice: rename switchdev to eswitch
    
    [ Upstream commit 5a841e4eb8ed2fea91025b19af8a9ba544f63323 ]
    
    Eswitch is used as a prefix for related functions. Main structure
    storing all data related to eswitch should also be named as eswitch
    instead of ice_switchdev_info. Rename it.
    
    Also rename switchdev to eswitch where the context is not about eswitch
    mode.
    
    ::uplink_netdev was changed to netdev for simplicity. There is no other
    netdev in function scope so it is obvious.
    
    Reviewed-by: Wojciech Drewek <[email protected]>
    Reviewed-by: Piotr Raczynski <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Signed-off-by: Michal Swiatkowski <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Stable-dep-of: fbcb968a98ac ("ice: Flush FDB entries before reset")
    Signed-off-by: Sasha Levin <[email protected]>

ice: set correct dst VSI in only LAN filters [+ + +]

Author: Michal Swiatkowski <[email protected]>
Date:   Mon Aug 19 12:14:01 2024 +0200

    ice: set correct dst VSI in only LAN filters
    
    [ Upstream commit 839e3f9bee425c90a0423d14b102a42fe6635c73 ]
    
    The filters set that will reproduce the problem:
    $ tc filter add dev $VF0_PR ingress protocol arp prio 0 flower \
            skip_sw dst_mac ff:ff:ff:ff:ff:ff action mirred egress \
            redirect dev $PF0
    $ tc filter add dev $VF0_PR ingress protocol arp prio 0 flower \
            skip_sw dst_mac ff:ff:ff:ff:ff:ff src_mac 52:54:00:00:00:10 \
            action mirred egress mirror dev $VF1_PR
    
    Expected behaviour is to send all broadcast from VF0 to the LAN. If the
    src_mac matches the value from the filters, send the packet to LAN and to VF1.
    
    In this case both the LAN_EN and LB_EN flags in the switch are set for a
    packet matching both filters. As the dst VSI for the LAN-only enable bit is
    the PF VSI, the packet is seen on the PF. To fix this, change the dst VSI to
    the source VSI. This blocks receiving any packet even when LB_EN is
    set by the switch, because local loopback is cleared on the VF VSI during
    normal operation.
    
    Side note: if the second filter's action is redirect instead of mirror,
    LAN_EN is clear, because the switch ANDs LAN_EN across all matched
    filters and ORs LB_EN.
    
    Reviewed-by: Przemek Kitszel <[email protected]>
    Fixes: 73b483b79029 ("ice: Manage act flags for switchdev offloads")
    Signed-off-by: Michal Swiatkowski <[email protected]>
    Reviewed-by: Jacob Keller <[email protected]>
    Tested-by: Sujai Buvaneswaran <[email protected]>
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

igb: Do not bring the device up after non-fatal error [+ + +]

Author: Mohamed Khalfella <[email protected]>
Date:   Tue Sep 24 15:06:01 2024 -0600

    igb: Do not bring the device up after non-fatal error
    
    [ Upstream commit 330a699ecbfc9c26ec92c6310686da1230b4e7eb ]
    
    Commit 004d25060c78 ("igb: Fix igb_down hung on surprise removal")
    changed igb_io_error_detected() to ignore non-fatal pcie errors in order
    to avoid hung task that can happen when igb_down() is called multiple
    times. This caused an issue when processing transient non-fatal errors.
    igb_io_resume(), which is called after igb_io_error_detected(), assumes
    that device is brought down by igb_io_error_detected() if the interface
    is up. This resulted in panic with stacktrace below.
    
    [ T3256] igb 0000:09:00.0 haeth0: igb: haeth0 NIC Link is Down
    [  T292] pcieport 0000:00:1c.5: AER: Uncorrected (Non-Fatal) error received: 0000:09:00.0
    [  T292] igb 0000:09:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
    [  T292] igb 0000:09:00.0:   device [8086:1537] error status/mask=00004000/00000000
    [  T292] igb 0000:09:00.0:    [14] CmpltTO
    [  T292] igb 0000:09:00.0: AER:   TLP Header: 00000000 00000000 00000000 00000000
    [  T292] pcieport 0000:00:1c.5: AER: broadcast error_detected message
    [  T292] igb 0000:09:00.0: Non-correctable non-fatal error reported.
    [  T292] pcieport 0000:00:1c.5: AER: broadcast mmio_enabled message
    [  T292] pcieport 0000:00:1c.5: AER: broadcast resume message
    [  T292] ------------[ cut here ]------------
    [  T292] kernel BUG at net/core/dev.c:6539!
    [  T292] invalid opcode: 0000 [#1] PREEMPT SMP
    [  T292] RIP: 0010:napi_enable+0x37/0x40
    [  T292] Call Trace:
    [  T292]  <TASK>
    [  T292]  ? die+0x33/0x90
    [  T292]  ? do_trap+0xdc/0x110
    [  T292]  ? napi_enable+0x37/0x40
    [  T292]  ? do_error_trap+0x70/0xb0
    [  T292]  ? napi_enable+0x37/0x40
    [  T292]  ? napi_enable+0x37/0x40
    [  T292]  ? exc_invalid_op+0x4e/0x70
    [  T292]  ? napi_enable+0x37/0x40
    [  T292]  ? asm_exc_invalid_op+0x16/0x20
    [  T292]  ? napi_enable+0x37/0x40
    [  T292]  igb_up+0x41/0x150
    [  T292]  igb_io_resume+0x25/0x70
    [  T292]  report_resume+0x54/0x70
    [  T292]  ? report_frozen_detected+0x20/0x20
    [  T292]  pci_walk_bus+0x6c/0x90
    [  T292]  ? aer_print_port_info+0xa0/0xa0
    [  T292]  pcie_do_recovery+0x22f/0x380
    [  T292]  aer_process_err_devices+0x110/0x160
    [  T292]  aer_isr+0x1c1/0x1e0
    [  T292]  ? disable_irq_nosync+0x10/0x10
    [  T292]  irq_thread_fn+0x1a/0x60
    [  T292]  irq_thread+0xe3/0x1a0
    [  T292]  ? irq_set_affinity_notifier+0x120/0x120
    [  T292]  ? irq_affinity_notify+0x100/0x100
    [  T292]  kthread+0xe2/0x110
    [  T292]  ? kthread_complete_and_exit+0x20/0x20
    [  T292]  ret_from_fork+0x2d/0x50
    [  T292]  ? kthread_complete_and_exit+0x20/0x20
    [  T292]  ret_from_fork_asm+0x11/0x20
    [  T292]  </TASK>
    
    To fix this issue, igb_io_resume() now checks whether the interface is
    running and the device is not down. If so, igb_io_error_detected() did not
    bring the device down and there is no need to bring it up.
    
    Signed-off-by: Mohamed Khalfella <[email protected]>
    Reviewed-by: Yuanyuan Zhong <[email protected]>
    Fixes: 004d25060c78 ("igb: Fix igb_down hung on surprise removal")
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Pucha Himasekhar Reddy <[email protected]> (A Contingent worker at Intel)
    Signed-off-by: Tony Nguyen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Input: synaptics-rmi4 - fix UAF of IRQ domain on driver removal [+ + +]

Author: Mathias Krause <[email protected]>
Date:   Wed Oct 9 05:40:12 2024 +0000

    Input: synaptics-rmi4 - fix UAF of IRQ domain on driver removal
    
    commit fbf8d71742557abaf558d8efb96742d442720cc2 upstream.
    
    Calling irq_domain_remove() will lead to freeing the IRQ domain
    prematurely. The domain is still referenced, and an attempt will be made
    to use it via rmi_free_function_list() -> rmi_unregister_function() ->
    irq_dispose_mapping() -> irq_get_irq_data()'s ->domain pointer.
    
    With PaX's MEMORY_SANITIZE this will lead to an access fault when
    attempting to dereference embedded pointers, as in Torsten's report that
    was faulting on the 'domain->ops->unmap' test.
    
    Fix this by releasing the IRQ domain only after all related IRQs have
    been deactivated.
    
    Fixes: 24d28e4f1271 ("Input: synaptics-rmi4 - convert irq distribution to irq_domain")
    Reported-by: Torsten Hilbrich <[email protected]>
    Signed-off-by: Mathias Krause <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>
    Signed-off-by: Tzung-Bi Shih <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

io_uring: check if we need to reschedule during overflow flush [+ + +]

Author: Jens Axboe <[email protected]>
Date:   Fri Sep 20 02:51:20 2024 -0600

    io_uring: check if we need to reschedule during overflow flush
    
    [ Upstream commit eac2ca2d682f94f46b1973bdf5e77d85d77b8e53 ]
    
    In terms of normal application usage, this list will always be empty.
    And if an application does overflow a bit, it'll have a few entries.
    However, nothing obviously prevents syzbot from running a test case
    that generates a ton of overflow entries, and then flushing them can
    take quite a while.
    
    Check for needing to reschedule while flushing, and drop our locks and
    do so if necessary. There's no state to maintain here as overflows
    always prune from head-of-list, hence it's fine to drop and reacquire
    the locks at the end of the loop.
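
    The drop-and-reacquire pattern can be sketched in user space as follows.
    This is a Python illustration under stated assumptions, not the io_uring
    code: flush_overflow, need_resched and post are hypothetical names, and a
    threading.Lock stands in for the kernel locks.

```python
import threading
from collections import deque


def flush_overflow(overflow, lock, need_resched, post):
    """Drain `overflow` from the head, yielding the lock when asked to.

    Returns how many times the lock was dropped and retaken.
    """
    yields = 0
    lock.acquire()
    try:
        while overflow:
            post(overflow.popleft())  # always prune from the head of the list
            if need_resched():
                lock.release()        # give other work a chance to run
                yields += 1
                lock.acquire()        # nothing to restore: the head is simply next
    finally:
        lock.release()
    return yields


posted = []
entries = deque(range(10))
yields = flush_overflow(entries, threading.Lock(),
                        lambda: len(posted) % 3 == 0, posted.append)
```

    Because each iteration only ever takes the current head, the loop keeps
    no cursor that could go stale while the lock is released.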
    
    Link: https://lore.kernel.org/io-uring/[email protected]/
    Reported-by: [email protected]
    Signed-off-by: Jens Axboe <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

jbd2: fix kernel-doc for j_transaction_overhead_buffers [+ + +]

Author: Randy Dunlap <[email protected]>
Date:   Mon Jul 22 22:16:47 2024 -0700

    jbd2: fix kernel-doc for j_transaction_overhead_buffers
    
    [ Upstream commit 7e8fb2eda9885ea2d13179a4c0bbf810f900ef25 ]
    
    Use the correct struct member name in the kernel-doc notation
    to prevent a kernel-doc build warning.
    
    include/linux/jbd2.h:1303: warning: Function parameter or struct member 'j_transaction_overhead_buffers' not described in 'journal_s'
    include/linux/jbd2.h:1303: warning: Excess struct member 'j_transaction_overhead' description in 'journal_s'
    
    Fixes: e3a00a23781c ("jbd2: precompute number of transaction descriptor blocks")
    Reported-by: Stephen Rothwell <[email protected]>
    Closes: https://lore.kernel.org/linux-next/[email protected]/
    Signed-off-by: Randy Dunlap <[email protected]>
    Reviewed-by: Jan Kara <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Theodore Ts'o <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ktest.pl: Avoid false positives with grub2 skip regex [+ + +]

Author: Daniel Jordan <[email protected]>
Date:   Wed Sep 4 13:55:30 2024 -0400

    ktest.pl: Avoid false positives with grub2 skip regex
    
    [ Upstream commit 2351e8c65404aabc433300b6bf90c7a37e8bbc4d ]
    
    Some distros have grub2 config files with the lines
    
        if [ x"${feature_menuentry_id}" = xy ]; then
          menuentry_id_option="--id"
        else
          menuentry_id_option=""
        fi
    
    which match the skip regex defined for grub2 in get_grub_index():
    
        $skip = '^\s*menuentry';
    
    These false positives cause the grub number to be higher than it
    should be, and the wrong kernel can end up booting.
    
    Grub documents the menuentry command with whitespace between it and the
    title, so make the skip regex reflect this.
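
    The false positive is easy to reproduce outside ktest.pl. The sketch
    below uses Python's re module rather than Perl, the menuentry title is
    made up, and the tightened pattern '^\s*menuentry\s' is an assumption
    about how the whitespace requirement is expressed.

```python
import re

# Sample config: the distro snippet quoted above plus one real menu entry
# (the entry title is hypothetical).
GRUB_CFG = '''\
if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
else
  menuentry_id_option=""
fi
menuentry 'Linux 6.6.57' {
}
'''


def count_entries(config: str, skip: str) -> int:
    """Count the lines that a skip regex treats as menu entries."""
    pattern = re.compile(skip)
    return sum(1 for line in config.splitlines() if pattern.search(line))


loose = count_entries(GRUB_CFG, r'^\s*menuentry')     # also hits menuentry_id_option
strict = count_entries(GRUB_CFG, r'^\s*menuentry\s')  # requires whitespace after the keyword
```

    Here the loose pattern counts the two menuentry_id_option assignments as
    entries, which is how the computed grub index ends up too high.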
    
    Link: https://lore.kernel.org/[email protected]
    Signed-off-by: Daniel Jordan <[email protected]>
    Acked-by: John 'Warthog9' Hawley (Tenstorrent) <[email protected]>
    Signed-off-by: Steven Rostedt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

kthread: unpark only parked kthread [+ + +]

Author: Frederic Weisbecker <[email protected]>
Date:   Fri Sep 13 23:46:34 2024 +0200

    kthread: unpark only parked kthread
    
    commit 214e01ad4ed7158cab66498810094fac5d09b218 upstream.
    
    Calling into kthread unparking unconditionally is mostly harmless when
    the kthread is already unparked. The wake up is then simply ignored
    because the target is not in TASK_PARKED state.
    
    However if the kthread is per CPU, the wake up is preceded by a call
    to kthread_bind() which expects the task to be inactive and in
    TASK_PARKED state, which obviously isn't the case if it is unparked.
    
    As a result, calling kthread_stop() on an unparked per-cpu kthread
    triggers such a warning:
    
            WARNING: CPU: 0 PID: 11 at kernel/kthread.c:525 __kthread_bind_mask kernel/kthread.c:525
             <TASK>
             kthread_stop+0x17a/0x630 kernel/kthread.c:707
             destroy_workqueue+0x136/0xc40 kernel/workqueue.c:5810
             wg_destruct+0x1e2/0x2e0 drivers/net/wireguard/device.c:257
             netdev_run_todo+0xe1a/0x1000 net/core/dev.c:10693
             default_device_exit_batch+0xa14/0xa90 net/core/dev.c:11769
             ops_exit_list net/core/net_namespace.c:178 [inline]
             cleanup_net+0x89d/0xcc0 net/core/net_namespace.c:640
             process_one_work kernel/workqueue.c:3231 [inline]
             process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
             worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
             kthread+0x2f0/0x390 kernel/kthread.c:389
             ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
             ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
             </TASK>
    
    Fix this by skipping unnecessary unparking while stopping a kthread.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 5c25b5ff89f0 ("workqueue: Tag bound workers with KTHREAD_IS_PER_CPU")
    Signed-off-by: Frederic Weisbecker <[email protected]>
    Reported-by: [email protected]
    Tested-by: [email protected]
    Suggested-by: Thomas Gleixner <[email protected]>
    Cc: Hillf Danton <[email protected]>
    Cc: Tejun Heo <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat [+ + +]

Author: Paul Menzel <[email protected]>
Date:   Mon Jul 1 17:58:01 2024 +0200

    lib/build_OID_registry: avoid non-destructive substitution for Perl < 5.13.2 compat
    
    [ Upstream commit 2fe29fe945637b9834c5569fbb1c9d4f881d8263 ]
    
    On a system with Perl 5.12.1, commit 5ef6dc08cfde
    ("lib/build_OID_registry: don't mention the full path of the script in
    output") causes the build to fail with the error below.
    
         Bareword found where operator expected at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
         syntax error at ./lib/build_OID_registry line 41, near "s#^\Q$abs_srctree/\E##r"
         Execution of ./lib/build_OID_registry aborted due to compilation errors.
         make[3]: *** [lib/Makefile:352: lib/oid_registry_data.c] Error 255
    
    Ahmad Fatoum determined that non-destructive substitution has only been
    supported since Perl 5.13.2. Instead of dropping `r` and having the side
    effect of modifying `$0`, introduce a dedicated variable to support older
    Perl versions.
    
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 5ef6dc08cfde ("lib/build_OID_registry: don't mention the full path of the script in output")
    Link: https://lore.kernel.org/all/[email protected]/
    Signed-off-by: Paul Menzel <[email protected]>
    Suggested-by: Ahmad Fatoum <[email protected]>
    Cc: Uwe Kleine-König <[email protected]>
    Cc: Nicolas Schier <[email protected]>
    Cc: Masahiro Yamada <[email protected]>
    Cc: Ahmad Fatoum <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

libceph: init the cursor when preparing sparse read in msgr2 [+ + +]

Author: Xiubo Li <[email protected]>
Date:   Wed Mar 6 09:05:44 2024 +0800

    libceph: init the cursor when preparing sparse read in msgr2
    
    [ Upstream commit 321e3c3de53c7530cd518219d01f04e7e32a9d23 ]
    
    The cursor is no longer initialized in the OSD client, causing the
    sparse read state machine to fall into an infinite loop.  The cursor
    should be initialized in IN_S_PREPARE_SPARSE_DATA state.
    
    [ idryomov: use msg instead of con->in_msg, changelog ]
    
    Link: https://tracker.ceph.com/issues/64607
    Fixes: 8e46a2d068c9 ("libceph: just wait for more data to be available on the socket")
    Signed-off-by: Xiubo Li <[email protected]>
    Reviewed-by: Ilya Dryomov <[email protected]>
    Tested-by: Luis Henriques <[email protected]>
    Signed-off-by: Ilya Dryomov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

libsubcmd: Don't free the usage string [+ + +]

Author: Aditya Gupta <[email protected]>
Date:   Wed Sep 4 11:48:30 2024 +0530

    libsubcmd: Don't free the usage string
    
    [ Upstream commit 1a5efc9e13f357abc396dbf445b25d08914c8060 ]
    
    Currently, commands which depend on 'parse_options_subcommand()' don't
    show the usage string, and instead show '(null)'
    
        $ ./perf sched
            Usage: (null)
    
        -D, --dump-raw-trace  dump raw trace in ASCII
        -f, --force           don't complain, do it
        -i, --input <file>    input file name
        -v, --verbose         be more verbose (show symbol address, etc)
    
    'parse_options_subcommand()' is generally expected to initialise the usage
    string, with information in the passed 'subcommands[]' array
    
    This behaviour was changed in:
    
      230a7a71f92212e7 ("libsubcmd: Fix parse-options memory leak")
    
    Where the generated usage string is deallocated, and usage[0] string is
    reassigned as NULL.
    
    As discussed in [1], free the allocated usage string in the main
    function itself, and don't reset usage string to NULL in
    parse_options_subcommand
    
    With this change, the behaviour is restored.
    
        $ ./perf sched
            Usage: perf sched [<options>] {record|latency|map|replay|script|timehist}
    
               -D, --dump-raw-trace  dump raw trace in ASCII
               -f, --force           don't complain, do it
               -i, --input <file>    input file name
               -v, --verbose         be more verbose (show symbol address, etc)
    
    [1]: https://lore.kernel.org/linux-perf-users/htq5vhx6piet4nuq2mmhk7fs2bhfykv52dbppwxmo3s7du2odf@styd27tioc6e/
    
    Fixes: 230a7a71f92212e7 ("libsubcmd: Fix parse-options memory leak")
    Suggested-by: Namhyung Kim <[email protected]>
    Signed-off-by: Aditya Gupta <[email protected]>
    Acked-by: Namhyung Kim <[email protected]>
    Tested-by: Arnaldo Carvalho de Melo <[email protected]>
    Cc: Athira Rajeev <[email protected]>
    Cc: Disha Goel <[email protected]>
    Cc: Ian Rogers <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: Kajol Jain <[email protected]>
    Cc: Madhavan Srinivasan <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Linux: Linux 6.6.57 [+ + +]

Author: Greg Kroah-Hartman <[email protected]>
Date:   Thu Oct 17 15:24:38 2024 +0200

    Linux 6.6.57
    
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Peter Schneider <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Tested-by: Shuah Khan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Tested-by: Takeshi Ogasawara <[email protected]>
    Tested-by: Ron Economos <[email protected]>
    Tested-by: Florian Fainelli <[email protected]>
    Tested-by: Kexy Biscuit <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Jon Hunter <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

LoongArch: Fix memleak in pci_acpi_scan_root() [+ + +]

Author: Wentao Guan <[email protected]>
Date:   Tue Sep 24 15:32:20 2024 +0800

    LoongArch: Fix memleak in pci_acpi_scan_root()
    
    [ Upstream commit 5016c3a31a6d74eaf2fdfdec673eae8fcf90379e ]
    
    Add kfree(root_ops) to avoid leaking root_ops, which happens when
    pci_find_bus() != 0.
    
    Signed-off-by: Yuli Wang <[email protected]>
    Signed-off-by: Wentao Guan <[email protected]>
    Signed-off-by: Huacai Chen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mctp: Handle error of rtnl_register_module(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Oct 8 11:47:35 2024 -0700

    mctp: Handle error of rtnl_register_module().
    
    [ Upstream commit d51705614f668254cc5def7490df76f9680b4659 ]
    
    Since its introduction, mctp has ignored the return value of
    rtnl_register_module(), which could fail silently.
    
    Handling the error allows users to view a module as an all-or-nothing
    thing in terms of the rtnetlink functionality.  This prevents syzkaller
    from reporting spurious errors from its tests, where OOM often occurs
    and module is automatically loaded.
    
    Let's handle the errors by rtnl_register_many().
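
    The all-or-nothing behaviour described above can be sketched as follows.
    This is a Python illustration, not the rtnetlink implementation:
    register_all_or_nothing, the dict used as a registry, the message type
    names and the fail_on hook that simulates a failure are all hypothetical.

```python
def register_all_or_nothing(registry, handlers, fail_on=None):
    """Register every (msgtype, handler) pair, or none of them.

    Returns 0 on success and -1 if any registration fails.
    """
    done = []
    for msgtype, handler in handlers:
        if msgtype == fail_on:        # simulated failure, e.g. out of memory
            for key in done:
                del registry[key]     # roll back what this call already added
            return -1
        registry[msgtype] = handler
        done.append(msgtype)
    return 0


registry = {}
handlers = [("NEWROUTE", len), ("DELROUTE", len), ("GETROUTE", len)]
failed = register_all_or_nothing(registry, handlers, fail_on="GETROUTE")
after_failure = dict(registry)
ok = register_all_or_nothing(registry, handlers)
```

    After a failed call the registry is left as it was, so a caller can treat
    the error as "module not loaded" rather than "module half loaded".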
    
    Fixes: 583be982d934 ("mctp: Add device handling and netlink interface")
    Fixes: 831119f88781 ("mctp: Add neighbour netlink interface")
    Fixes: 06d2f4c583a7 ("mctp: Add netlink route management")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Reviewed-by: Jeremy Kerr <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put() [+ + +]

Author: Yunke Cao <[email protected]>
Date:   Wed Aug 14 11:06:40 2024 +0900

    media: videobuf2-core: clear memory related fields in __vb2_plane_dmabuf_put()
    
    [ Upstream commit 6a9c97ab6b7e85697e0b74e86062192a5ffffd99 ]
    
    Clear vb2_plane's memory related fields in __vb2_plane_dmabuf_put(),
    including bytesused, length, fd and data_offset.
    
    Remove the duplicated code in __prepare_dmabuf().
    
    Signed-off-by: Yunke Cao <[email protected]>
    Acked-by: Tomasz Figa <[email protected]>
    Signed-off-by: Hans Verkuil <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mfd: intel_soc_pmic_chtwc: Make Lenovo Yoga Tab 3 X90F DMI match less strict [+ + +]

Author: Hans de Goede <[email protected]>
Date:   Sun Aug 25 15:26:17 2024 +0200

    mfd: intel_soc_pmic_chtwc: Make Lenovo Yoga Tab 3 X90F DMI match less strict
    
    [ Upstream commit ae7eee56cdcfcb6a886f76232778d6517fd58690 ]
    
    There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it
    turns out that the 2G version has a DMI product name of
    "CHERRYVIEW D1 PLATFORM" whereas the 4G version has
    "CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version checks are
    unique enough that the product-name check is not necessary.
    
    Drop the product-name check so that the existing DMI match for the 4G
    RAM version also matches the 2G RAM version.
    
    Signed-off-by: Hans de Goede <[email protected]>
    Reviewed-by: Andy Shevchenko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lee Jones <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mpls: Handle error of rtnl_register_module(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Oct 8 11:47:36 2024 -0700

    mpls: Handle error of rtnl_register_module().
    
    [ Upstream commit 5be2062e3080e3ff6707816caa445ec0c6eaacf7 ]
    
    Since its introduction, mpls_init() has ignored the return
    value of rtnl_register_module(), which could fail silently.
    
    Handling the error allows users to view a module as an all-or-nothing
    thing in terms of the rtnetlink functionality.  This prevents syzkaller
    from reporting spurious errors from its tests, where OOM often occurs
    and module is automatically loaded.
    
    Let's handle the errors by rtnl_register_many().
    
    Fixes: 03c0566542f4 ("mpls: Netlink commands to add, remove, and dump routes")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

mpls: no longer hold RTNL in mpls_netconf_dump_devconf() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Wed Apr 10 11:19:50 2024 +0000

    mpls: no longer hold RTNL in mpls_netconf_dump_devconf()
    
    [ Upstream commit e0f89d2864b062b027196925ea19f94b2ce50d6a ]
    
    - Use for_each_netdev_dump() to no longer rely
      on net->dev_index_head hash table.
    
    - No longer care about net->dev_base_seq
    
    - Fix return value at the end of a dump,
      so that NLMSG_DONE can be appended to current skb,
      saving one recvmsg() system call.
    
    - No longer grab RTNL, RCU protection is enough,
      after adding one READ_ONCE(mdev->input_enabled)
      in mpls_netconf_fill_devconf()
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: 5be2062e3080 ("mpls: Handle error of rtnl_register_module().")
    Signed-off-by: Sasha Levin <[email protected]>

mptcp: fallback when MPTCP opts are dropped after 1st data [+ + +]

Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Tue Oct 8 13:04:54 2024 +0200

    mptcp: fallback when MPTCP opts are dropped after 1st data
    
    commit 119d51e225febc8152476340a880f5415a01e99e upstream.
    
    As reported by Christoph [1], before this patch, an MPTCP connection was
    wrongly reset when a host received a first data packet with MPTCP
    options after the 3wHS, but got the next ones without.
    
    According to the MPTCP v1 specs [2], a fallback should happen in this
    case, because the host didn't receive a DATA_ACK from the other peer,
    nor receive data for more than the initial window which implies a
    DATA_ACK being received by the other peer.
    
    The patch here re-uses the same logic as the one used in other places:
    by looking at allow_infinite_fallback, which is disabled at the creation
    of an additional subflow. It's not looking at the first DATA_ACK (or
    implying one received from the other side) as suggested by the RFC, but
    it is in continuation with what was already done, which is safer, and it
    fixes the reported issue. The next step, looking at this first DATA_ACK,
    is tracked in [4].
    
    This patch has been validated using the following Packetdrill script:
    
       0 socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3
      +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
      +0 bind(3, ..., ...) = 0
      +0 listen(3, 1) = 0
    
      // 3WHS is OK
      +0.0 < S  0:0(0)       win 65535  <mss 1460, sackOK, nop, nop, nop, wscale 6, mpcapable v1 flags[flag_h] nokey>
      +0.0 > S. 0:0(0) ack 1            <mss 1460, nop, nop, sackOK, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey]>
      +0.1 <  . 1:1(0) ack 1 win 2048                                              <mpcapable v1 flags[flag_h] key[ckey=2, skey]>
      +0 accept(3, ..., ...) = 4
    
      // Data from the client with valid MPTCP options (no DATA_ACK: normal)
      +0.1 < P. 1:501(500) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[skey, ckey] mpcdatalen 500, nop, nop>
      // From here, the MPTCP options will be dropped by a middlebox
      +0.0 >  . 1:1(0)     ack 501        <dss dack8=501 dll=0 nocs>
    
      +0.1 read(4, ..., 500) = 500
      +0   write(4, ..., 100) = 100
    
      // The server replies with data, still thinking MPTCP is being used
      +0.0 > P. 1:101(100)   ack 501          <dss dack8=501 dsn8=1 ssn=1 dll=100 nocs, nop, nop>
      // But the client already did a fallback to TCP, because the two previous packets have been received without MPTCP options
      +0.1 <  . 501:501(0)   ack 101 win 2048
    
      +0.0 < P. 501:601(100) ack 101 win 2048
      // The server should fallback to TCP, not reset: it didn't get a DATA_ACK, nor data for more than the initial window
      +0.0 >  . 101:101(0)   ack 601
    
    Note that this script requires Packetdrill with MPTCP support, see [3].
    
    Fixes: dea2b1ea9c70 ("mptcp: do not reset MP_CAPABLE subflow on mapping errors")
    Cc: [email protected]
    Reported-by: Christoph Paasch <[email protected]>
    Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/518 [1]
    Link: https://datatracker.ietf.org/doc/html/rfc8684#name-fallback [2]
    Link: https://github.com/multipath-tcp/packetdrill [3]
    Link: https://github.com/multipath-tcp/mptcp_net-next/issues/519 [4]
    Reviewed-by: Paolo Abeni <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mptcp: handle consistently DSS corruption [+ + +]

Author: Paolo Abeni <[email protected]>
Date:   Tue Oct 8 13:04:52 2024 +0200

    mptcp: handle consistently DSS corruption
    
    commit e32d262c89e2b22cb0640223f953b548617ed8a6 upstream.
    
    A buggy peer implementation can send corrupted DSS options, consistently
    hitting a few warnings in the data path. Use DEBUG_NET assertions to
    avoid the splat on some builds, and handle the error consistently, dumping
    the related MIBs and performing a fallback and/or reset according to the
    subflow type.
    
    Fixes: 6771bfd9ee24 ("mptcp: update mptcp ack sequence from work queue")
    Cc: [email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

mptcp: pm: do not remove closing subflows [+ + +]

Author: Matthieu Baerts (NGI0) <[email protected]>
Date:   Tue Oct 8 13:04:55 2024 +0200

    mptcp: pm: do not remove closing subflows
    
    commit db0a37b7ac27d8ca27d3dc676a16d081c16ec7b9 upstream.
    
    In a previous fix, the in-kernel path-manager has been modified not to
    retrigger the removal of a subflow if it was already closed, e.g. when
    the initial subflow is removed, but kept in the subflows list.
    
    To be complete, this fix should also skip the subflows that are in any
    closing state: mptcp_close_ssk() will initiate the closure, but the
    switch to the TCP_CLOSE state depends on the other peer.
    
    Fixes: 58e1b66b4e4b ("mptcp: pm: do not remove already closed subflows")
    Cc: [email protected]
    Suggested-by: Paolo Abeni <[email protected]>
    Acked-by: Paolo Abeni <[email protected]>
    Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net/sched: accept TCA_STAB only for root qdisc [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Mon Oct 7 18:41:30 2024 +0000

    net/sched: accept TCA_STAB only for root qdisc
    
    [ Upstream commit 3cb7cf1540ddff5473d6baeb530228d19bc97b8a ]
    
    Most qdiscs maintain their backlog using qdisc_pkt_len(skb)
    on the assumption it is invariant between the enqueue()
    and dequeue() handlers.
    
    Unfortunately syzbot can crash a host rather easily using
    a TBF + SFQ combination, with an STAB on SFQ [1]
    
    We can't support TCA_STAB at arbitrary levels: this would
    require maintaining per-qdisc storage.
    
    [1]
    [   88.796496] BUG: kernel NULL pointer dereference, address: 0000000000000000
    [   88.798611] #PF: supervisor read access in kernel mode
    [   88.799014] #PF: error_code(0x0000) - not-present page
    [   88.799506] PGD 0 P4D 0
    [   88.799829] Oops: Oops: 0000 [#1] SMP NOPTI
    [   88.800569] CPU: 14 UID: 0 PID: 2053 Comm: b371744477 Not tainted 6.12.0-rc1-virtme #1117
    [   88.801107] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
    [   88.801779] RIP: 0010:sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq
    [ 88.802544] Code: 0f b7 50 12 48 8d 04 d5 00 00 00 00 48 89 d6 48 29 d0 48 8b 91 c0 01 00 00 48 c1 e0 03 48 01 c2 66 83 7a 1a 00 7e c0 48 8b 3a <4c> 8b 07 4c 89 02 49 89 50 08 48 c7 47 08 00 00 00 00 48 c7 07 00
    All code
    ========
       0:   0f b7 50 12             movzwl 0x12(%rax),%edx
       4:   48 8d 04 d5 00 00 00    lea    0x0(,%rdx,8),%rax
       b:   00
       c:   48 89 d6                mov    %rdx,%rsi
       f:   48 29 d0                sub    %rdx,%rax
      12:   48 8b 91 c0 01 00 00    mov    0x1c0(%rcx),%rdx
      19:   48 c1 e0 03             shl    $0x3,%rax
      1d:   48 01 c2                add    %rax,%rdx
      20:   66 83 7a 1a 00          cmpw   $0x0,0x1a(%rdx)
      25:   7e c0                   jle    0xffffffffffffffe7
      27:   48 8b 3a                mov    (%rdx),%rdi
      2a:*  4c 8b 07                mov    (%rdi),%r8               <-- trapping instruction
      2d:   4c 89 02                mov    %r8,(%rdx)
      30:   49 89 50 08             mov    %rdx,0x8(%r8)
      34:   48 c7 47 08 00 00 00    movq   $0x0,0x8(%rdi)
      3b:   00
      3c:   48                      rex.W
      3d:   c7                      .byte 0xc7
      3e:   07                      (bad)
            ...
    
    Code starting with the faulting instruction
    ===========================================
       0:   4c 8b 07                mov    (%rdi),%r8
       3:   4c 89 02                mov    %r8,(%rdx)
       6:   49 89 50 08             mov    %rdx,0x8(%r8)
       a:   48 c7 47 08 00 00 00    movq   $0x0,0x8(%rdi)
      11:   00
      12:   48                      rex.W
      13:   c7                      .byte 0xc7
      14:   07                      (bad)
            ...
    [   88.803721] RSP: 0018:ffff9a1f892b7d58 EFLAGS: 00000206
    [   88.804032] RAX: 0000000000000000 RBX: ffff9a1f8420c800 RCX: ffff9a1f8420c800
    [   88.804560] RDX: ffff9a1f81bc1440 RSI: 0000000000000000 RDI: 0000000000000000
    [   88.805056] RBP: ffffffffc04bb0e0 R08: 0000000000000001 R09: 00000000ff7f9a1f
    [   88.805473] R10: 000000000001001b R11: 0000000000009a1f R12: 0000000000000140
    [   88.806194] R13: 0000000000000001 R14: ffff9a1f886df400 R15: ffff9a1f886df4ac
    [   88.806734] FS:  00007f445601a740(0000) GS:ffff9a2e7fd80000(0000) knlGS:0000000000000000
    [   88.807225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [   88.807672] CR2: 0000000000000000 CR3: 000000050cc46000 CR4: 00000000000006f0
    [   88.808165] Call Trace:
    [   88.808459]  <TASK>
    [   88.808710] ? __die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434)
    [   88.809261] ? page_fault_oops (arch/x86/mm/fault.c:715)
    [   88.809561] ? exc_page_fault (./arch/x86/include/asm/irqflags.h:26 ./arch/x86/include/asm/irqflags.h:87 ./arch/x86/include/asm/irqflags.h:147 arch/x86/mm/fault.c:1489 arch/x86/mm/fault.c:1539)
    [   88.809806] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
    [   88.810074] ? sfq_dequeue (net/sched/sch_sfq.c:272 net/sched/sch_sfq.c:499) sch_sfq
    [   88.810411] sfq_reset (net/sched/sch_sfq.c:525) sch_sfq
    [   88.810671] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036)
    [   88.810950] tbf_reset (./include/linux/timekeeping.h:169 net/sched/sch_tbf.c:334) sch_tbf
    [   88.811208] qdisc_reset (./include/linux/skbuff.h:2135 ./include/linux/skbuff.h:2441 ./include/linux/skbuff.h:3304 ./include/linux/skbuff.h:3310 net/sched/sch_generic.c:1036)
    [   88.811484] netif_set_real_num_tx_queues (./include/linux/spinlock.h:396 ./include/net/sch_generic.h:768 net/core/dev.c:2958)
    [   88.811870] __tun_detach (drivers/net/tun.c:590 drivers/net/tun.c:673)
    [   88.812271] tun_chr_close (drivers/net/tun.c:702 drivers/net/tun.c:3517)
    [   88.812505] __fput (fs/file_table.c:432 (discriminator 1))
    [   88.812735] task_work_run (kernel/task_work.c:230)
    [   88.813016] do_exit (kernel/exit.c:940)
    [   88.813372] ? trace_hardirqs_on (kernel/trace/trace_preemptirq.c:58 (discriminator 4))
    [   88.813639] ? handle_mm_fault (./arch/x86/include/asm/irqflags.h:42 ./arch/x86/include/asm/irqflags.h:97 ./arch/x86/include/asm/irqflags.h:155 ./include/linux/memcontrol.h:1022 ./include/linux/memcontrol.h:1045 ./include/linux/memcontrol.h:1052 mm/memory.c:5928 mm/memory.c:6088)
    [   88.813867] do_group_exit (kernel/exit.c:1070)
    [   88.814138] __x64_sys_exit_group (kernel/exit.c:1099)
    [   88.814490] x64_sys_call (??:?)
    [   88.814791] do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1) arch/x86/entry/common.c:83 (discriminator 1))
    [   88.815012] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
    [   88.815495] RIP: 0033:0x7f44560f1975
    
    Fixes: 175f9c1bba9b ("net_sched: Add size table for qdiscs")
    Reported-by: syzbot <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Daniel Borkmann <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: do not delay dst_entries_add() in dst_release() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Tue Oct 8 14:31:10 2024 +0000

    net: do not delay dst_entries_add() in dst_release()
    
    [ Upstream commit ac888d58869bb99753e7652be19a151df9ecb35d ]
    
    dst_entries_add() uses per-cpu data that might be freed at netns
    dismantle from ip6_route_net_exit() calling dst_entries_destroy()
    
    Before ip6_route_net_exit() can be called, we release all
    the dsts associated with this netns, via calls to dst_release(),
    which waits an rcu grace period before calling dst_destroy()
    
    dst_entries_add() use in dst_destroy() is racy, because
    dst_entries_destroy() could have been called already.
    
    Decrementing the number of dsts must happen sooner.
    
    Notes:
    
    1) in CONFIG_XFRM case, dst_destroy() can call
       dst_release_immediate(child), this might also cause UAF
       if the child does not have DST_NOCOUNT set.
       IPSEC maintainers might take a look and see how to address this.
    
    2) There is also discussion about removing this count of dst,
       which might happen in future kernels.
    
    Fixes: f88649721268 ("ipv4: fix dst race in sk_dst_get()")
    Closes: https://lore.kernel.org/lkml/CANn89iLCCGsP7SFn9HKpvnKu96Td4KD08xf7aGtiYgZnkjaL=w@mail.gmail.com/T/
    Reported-by: Naresh Kamboju <[email protected]>
    Tested-by: Linux Kernel Functional Testing <[email protected]>
    Tested-by: Naresh Kamboju <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Xin Long <[email protected]>
    Cc: Steffen Klassert <[email protected]>
    Reviewed-by: Xin Long <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: b53: allow lower MTUs on BCM5325/5365 [+ + +]

Author: Jonas Gorski <[email protected]>
Date:   Fri Oct 4 10:47:20 2024 +0200

    net: dsa: b53: allow lower MTUs on BCM5325/5365
    
    [ Upstream commit e4b294f88a32438baf31762441f3dd1c996778be ]
    
    While BCM5325/5365 do not support jumbo frames, they do support slightly
    oversized frames, so do not error out if requesting a supported MTU for
    them.
    
    Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: b53: fix jumbo frame mtu check [+ + +]

Author: Jonas Gorski <[email protected]>
Date:   Fri Oct 4 10:47:17 2024 +0200

    net: dsa: b53: fix jumbo frame mtu check
    
    [ Upstream commit 42fb3acf6826c6764ba79feb6e15229b43fd2f9f ]
    
    JMS_MIN_SIZE is the full ethernet frame length, while mtu is just the
    data payload size. Comparing these two meant that mtus between 1500 and
    1518 did not trigger enabling jumbo frames.
    
    So instead compare the set mtu to ETH_DATA_LEN, which is equal to
    JMS_MIN_SIZE - ETH_HLEN - ETH_FCS_LEN.
    
    Also do a check that the requested mtu is actually greater than the
    minimum length, else we do not need to enable jumbo frames.
    
    In practice this only introduced a very small range of mtus that did not
    work properly. Newer chips allow 2000 byte large frames by default, and
    older chips allow frames 1536 bytes long, which is equivalent to an mtu of
    1514. So effectively only mtus of 1515~1517 were broken.
    
    Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
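
    The frame-length versus MTU arithmetic in this message can be checked
    with a few lines of C. This is a sketch under assumptions: the header
    sizes are the standard Ethernet values, JMS_MIN_SIZE is set to 1518 as
    implied by the equation in the commit text, and jumbo_needed_old(),
    jumbo_needed_new() and frame_len_to_vlan_mtu() are hypothetical helpers
    reconstructed from the description, not copied from the driver.

```c
#include <stdbool.h>

#define ETH_HLEN	14	/* Ethernet header */
#define ETH_FCS_LEN	4	/* frame check sequence */
#define VLAN_HLEN	4	/* 802.1Q tag */
#define ETH_DATA_LEN	1500	/* standard payload size (default MTU) */
#define JMS_MIN_SIZE	1518	/* full frame: ETH_DATA_LEN + header + FCS */

/* Old comparison (assumed): an MTU checked against a full frame length. */
static bool jumbo_needed_old(int mtu)
{
	return mtu >= JMS_MIN_SIZE;
}

/* Fixed comparison (assumed): an MTU checked against a payload length. */
static bool jumbo_needed_new(int mtu)
{
	return mtu > ETH_DATA_LEN;
}

/* Largest MTU that fits in a frame of frame_len bytes with a VLAN tag. */
static int frame_len_to_vlan_mtu(int frame_len)
{
	return frame_len - ETH_HLEN - VLAN_HLEN - ETH_FCS_LEN;
}
```

    For an MTU of 1516 the old comparison does not request jumbo frames
    while the new one does, and frame_len_to_vlan_mtu(1536) gives the 1514
    quoted above for older chips.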

net: dsa: b53: fix jumbo frames on 10/100 ports [+ + +]

Author: Jonas Gorski <[email protected]>
Date:   Fri Oct 4 10:47:21 2024 +0200

    net: dsa: b53: fix jumbo frames on 10/100 ports
    
    [ Upstream commit 2f3dcd0d39affe5b9ba1c351ce0e270c8bdd5109 ]
    
    All modern chips support and need the 10_100 bit set for supporting jumbo
    frames on 10/100 ports, so instead of enabling it only for 583XX enable
    it for everything except bcm63xx, where the bit is writeable, but does
    nothing.
    
    Tested on BCM53115, where jumbo frames were dropped at 10/100 speeds
    without the bit set.
    
    Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: b53: fix max MTU for 1g switches [+ + +]

Author: Jonas Gorski <[email protected]>
Date:   Fri Oct 4 10:47:18 2024 +0200

    net: dsa: b53: fix max MTU for 1g switches
    
    [ Upstream commit 680a8217dc00dc7e7da57888b3c053289b60eb2b ]
    
    JMS_MAX_SIZE is the ethernet frame length, not the MTU, which is payload
    without ethernet headers.
    
    According to the datasheets, the maximum supported frame length for most
    gigabit switches is 9720 bytes, so convert that to the expected MTU when
    using VLAN tagged frames.
    
    Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: b53: fix max MTU for BCM5325/BCM5365 [+ + +]

Author: Jonas Gorski <[email protected]>
Date:   Fri Oct 4 10:47:19 2024 +0200

    net: dsa: b53: fix max MTU for BCM5325/BCM5365
    
    [ Upstream commit ca8c1f71c10193c270f772d70d34b15ad765d6a8 ]
    
    BCM5325/BCM5365 do not support jumbo frames, so we should not report a
    jumbo frame mtu for them. But they do support so called "oversized"
    frames up to 1536 bytes long by default, so report an appropriate MTU.
    
    Fixes: 6ae5834b983a ("net: dsa: b53: add MTU configuration support")
    Signed-off-by: Jonas Gorski <[email protected]>
    Reviewed-by: Florian Fainelli <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: dsa: lan9303: ensure chip reset and wait for READY status [+ + +]

Author: Anatolij Gustschin <[email protected]>
Date:   Fri Oct 4 13:36:54 2024 +0200

    net: dsa: lan9303: ensure chip reset and wait for READY status
    
    commit 5c14e51d2d7df49fe0d4e64a12c58d2542f452ff upstream.
    
    Accessing device registers does not seem to be reliable: the chip
    revision is sometimes detected wrongly (0 instead of expected 1).
    
    Ensure that the chip reset is performed via reset GPIO and then
    wait for 'Device Ready' status in HW_CFG register before doing
    any register initializations.
    
    Cc: [email protected]
    Fixes: a1292595e006 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
    Signed-off-by: Anatolij Gustschin <[email protected]>
    [alex: reworked using read_poll_timeout()]
    Signed-off-by: Alexander Sverdlin <[email protected]>
    Reviewed-by: Vladimir Oltean <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: ethernet: adi: adin1110: Fix some error handling path in adin1110_read_fifo() [+ + +]

Author: Christophe JAILLET <[email protected]>
Date:   Thu Oct 3 20:53:15 2024 +0200

    net: ethernet: adi: adin1110: Fix some error handling path in adin1110_read_fifo()
    
    [ Upstream commit 83211ae1640516accae645de82f5a0a142676897 ]
    
    If 'frame_size' is too small or if 'round_len' is an error code, it is
    likely that an error code should be returned to the caller.
    
    Actually, 'ret' is likely to be 0, so if one of these sanity checks fails,
    'success' is returned.
    
    Return -EINVAL instead.
    
    Fixes: bc93e19d088b ("net: ethernet: adi: Add ADIN1110 support")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/8ff73b40f50d8fa994a454911b66adebce8da266.1727981562.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: cortina: Drop TSO support [+ + +]

Author: Linus Walleij <[email protected]>
Date:   Sat Jan 6 01:12:22 2024 +0100

    net: ethernet: cortina: Drop TSO support
    
    [ Upstream commit ac631873c9e7a50d2a8de457cfc4b9f86666403e ]
    
    The recent change to allow large frames without hardware checksumming
    slotted in software checksumming in the driver if hardware could not
    do it.
    
    This will however upset TSO (TCP Segment Offloading). Typical
    error dumps include this:
    
    skb len=2961 headroom=222 headlen=66 tailroom=0
    (...)
    WARNING: CPU: 0 PID: 956 at net/core/dev.c:3259 skb_warn_bad_offload+0x7c/0x108
    gemini-ethernet-port: caps=(0x0000010000154813, 0x00002007ffdd7889)
    
    And the packets do not go through.
    
    The TSO implementation is bogus: a TSO enabled driver must propagate
    the skb_shinfo(skb)->gso_size value to the TSO engine on the NIC.
    
    Drop the size check and TSO offloading features for now: this
    needs to be fixed up properly.
    
    After this ethernet works fine on Gemini devices with a direct connected
    PHY such as D-Link DNS-313.
    
    Also tested to still be working with a DSA switch using the Gemini
    ethernet as conduit interface.
    
    Link: https://lore.kernel.org/netdev/CANn89iJLfxng1sYL5Zk0mknXpyYQPCp83m3KgD2KJ2_hKCpEUg@mail.gmail.com/
    Suggested-by: Eric Dumazet <[email protected]>
    Fixes: d4d0c5b4d279 ("net: ethernet: cortina: Handle large frames")
    Signed-off-by: Linus Walleij <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: ethernet: cortina: Restore TSO support [+ + +]

Author: Linus Walleij <[email protected]>
Date:   Mon May 27 21:26:44 2024 +0200

    net: ethernet: cortina: Restore TSO support
    
    commit 2942dfab630444d46aaa37fb7d629b620abbf6ba upstream.
    
    An earlier commit deleted the TSO support in the Cortina Gemini
    driver because the driver was confusing gso_size and MTU,
    probably because what the Linux kernel calls "gso_size" was
    called "MTU" in the datasheet.
    
    Restore the functionality properly reading the gso_size from
    the skbuff.
    
    Tested with iperf3, running a server on a different machine
    and client on the device with the cortina gemini ethernet:
    
    Connecting to host 192.168.1.2, port 5201
    60008000.ethernet-port eth0: segment offloading mss = 05ea len=1c8a
    60008000.ethernet-port eth0: segment offloading mss = 05ea len=1c8a
    60008000.ethernet-port eth0: segment offloading mss = 05ea len=27da
    60008000.ethernet-port eth0: segment offloading mss = 05ea len=0b92
    60008000.ethernet-port eth0: segment offloading mss = 05ea len=2bda
    (...)
    
    (The hardware MSS 0x05ea here includes the ethernet headers.)
    
    If I disable all segment offloading on the receiving host and
    dump packets using tcpdump -xx like this:
    
    ethtool -K enp2s0 gro off gso off tso off
    tcpdump -xx -i enp2s0 host 192.168.1.136
    
    I get segmented packets such as this when running iperf3:
    
    23:16:54.024139 IP OpenWrt.lan.59168 > Fecusia.targus-getdata1:
    Flags [.], seq 1486:2934, ack 1, win 4198,
    options [nop,nop,TS val 3886192908 ecr 3601341877], length 1448
    0x0000:  fc34 9701 a0c6 14d6 4da8 3c4f 0800 4500
    0x0010:  05dc 16a0 4000 4006 9aa1 c0a8 0188 c0a8
    0x0020:  0102 e720 1451 ff25 9822 4c52 29cf 8010
    0x0030:  1066 ac8c 0000 0101 080a e7a2 990c d6a8
    (...)
    0x05c0:  5e49 e109 fe8c 4617 5e18 7a82 7eae d647
    0x05d0:  e8ee ae64 dc88 c897 3f8a 07a4 3a33 6b1b
    0x05e0:  3501 a30f 2758 cc44 4b4a
    
    Several such packets often follow one another, verifying
    the segmentation into 0x05a8 (1448) byte packets also on the
    receiving end. As can be seen, the ethernet frames are
    0x05ea (1514) in size.
    
    Performance with iperf3 before this patch: ~15.5 Mbit/s
    Performance with iperf3 after this patch: ~175 Mbit/s
    
    This was running a 60 second test (twice); the best measurement
    was 179 Mbit/s.
    
    For comparison if I run iperf3 with UDP I get around 1.05 Mbit/s
    both before and after this patch.
    
    While this is a gigabit ethernet interface, the CPU is a cheap
    D-Link DIR-685 router (based on the ARMv5 Faraday FA526 at
    ~50 MHz), and the software is not supposed to drive traffic,
    as the device has a DSA chip, so this kind of numbers can be
    expected.
    
    Fixes: ac631873c9e7 ("net: ethernet: cortina: Drop TSO support")
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: Linus Walleij <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
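
    The sizes quoted in the tcpdump output above are consistent with each
    other, which a short calculation shows. The header lengths are read
    from the dump itself (IPv4 header byte 0x45 means 20 bytes, TCP data
    offset 8 means 32 bytes including the timestamp option);
    tcp_payload_len() is a hypothetical helper for illustration only.

```c
#define ETH_HLEN	14	/* Ethernet header, no VLAN tag */
#define IPV4_HLEN	20	/* IHL = 5 words, no IP options */
#define TCP_HLEN	32	/* 20 byte base header + nop, nop, timestamp */

/* TCP payload carried by an Ethernet frame of frame_len bytes (FCS excluded). */
static int tcp_payload_len(int frame_len)
{
	return frame_len - ETH_HLEN - IPV4_HLEN - TCP_HLEN;
}
```

    A 0x05ea (1514) byte frame carries a 0x05dc (1500) byte IP packet, and
    tcp_payload_len(0x05ea) is 0x05a8 (1448), matching the "length 1448"
    reported by tcpdump.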

net: explicitly clear the sk pointer, when pf->create fails [+ + +]

Author: Ignat Korchagin <[email protected]>
Date:   Thu Oct 3 18:01:51 2024 +0100

    net: explicitly clear the sk pointer, when pf->create fails
    
    commit 631083143315d1b192bd7d915b967b37819e88ea upstream.
    
    We have recently noticed the exact same KASAN splat as in commit
    6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket
    creation fails"). The problem is that commit did not fully address the
    problem, as some pf->create implementations do not use sk_common_release
    in their error paths.
    
    For example, we can use the same reproducer as in the above commit, but
    changing ping to arping. arping uses AF_PACKET socket and if packet_create
    fails, it will just sk_free the allocated sk object.
    
    While we could chase all the pf->create implementations and make sure they
    NULL the freed sk object on error from the socket, we can't guarantee
    future protocols will not make the same mistake.
    
    So it is easier to just explicitly NULL the sk pointer upon return from
    pf->create in __sock_create. We do know that pf->create always releases the
    allocated sk object on error, so if the pointer is not NULL, it is
    definitely dangling.
    
    Fixes: 6cd4a78d962b ("net: do not leave a dangling sk pointer, when socket creation fails")
    Signed-off-by: Ignat Korchagin <[email protected]>
    Cc: [email protected]
    Reviewed-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
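
    The idea of clearing the pointer in the common caller rather than in
    every callback can be sketched as follows. All names here (struct
    holder, failing_create(), create_socket()) are hypothetical; the real
    change is in __sock_create() and operates on struct socket.

```c
#include <stdlib.h>

struct sk_obj { int unused; };

/* Hypothetical stand-in for a socket that points at its sk object. */
struct holder { struct sk_obj *sk; };

/*
 * A create callback whose error path frees the object but forgets to
 * clear the pointer, leaving h->sk dangling.
 */
static int failing_create(struct holder *h)
{
	h->sk = malloc(sizeof(*h->sk));
	if (!h->sk)
		return -1;
	free(h->sk);		/* released, but h->sk still holds the address */
	return -1;
}

/*
 * Common caller: the callback is trusted to have released the object on
 * error, so any non-NULL pointer left behind is dangling and is cleared.
 */
static int create_socket(struct holder *h, int (*create)(struct holder *))
{
	int err = create(h);

	if (err < 0) {
		h->sk = NULL;
		return err;
	}
	return 0;
}
```

    Doing the reset in one place covers existing callbacks and any future
    ones, at the cost of relying on the rule that a failing callback always
    releases what it allocated.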

net: fec: don't save PTP state if PTP is unsupported [+ + +]

Author: Wei Fang <[email protected]>
Date:   Tue Oct 8 14:11:53 2024 +0800

    net: fec: don't save PTP state if PTP is unsupported
    
    commit 6be063071a457767ee229db13f019c2ec03bfe44 upstream.
    
    Some platforms (such as i.MX25 and i.MX27) do not support PTP, so on
    these platforms fec_ptp_init() is not called and the related members
    in fep are not initialized. However, fec_ptp_save_state() is called
    unconditionally, which causes the kernel to panic. Therefore, add a
    condition so that fec_ptp_save_state() is not called if PTP is not
    supported.
    
    Fixes: a1477dc87dc4 ("net: fec: Restart PPS after link state change")
    Reported-by: Guenter Roeck <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Signed-off-by: Wei Fang <[email protected]>
    Reviewed-by: Csókás, Bence <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Tested-by: Guenter Roeck <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

net: Fix an unsafe loop on the list [+ + +]

Author: Anastasia Kovaleva <[email protected]>
Date:   Thu Oct 3 13:44:31 2024 +0300

    net: Fix an unsafe loop on the list
    
    commit 1dae9f1187189bc09ff6d25ca97ead711f7e26f9 upstream.
    
    The kernel may crash when deleting a genetlink family if there are still
    listeners for that family:
    
    Oops: Kernel access of bad area, sig: 11 [#1]
      ...
      NIP [c000000000c080bc] netlink_update_socket_mc+0x3c/0xc0
      LR [c000000000c0f764] __netlink_clear_multicast_users+0x74/0xc0
      Call Trace:
    __netlink_clear_multicast_users+0x74/0xc0
    genl_unregister_family+0xd4/0x2d0
    
    Change the unsafe loop on the list to a safe one, because inside the
    loop there is an element removal from this list.
    
    Fixes: b8273570f802 ("genetlink: fix netns vs. netlink table locking (2)")
    Cc: [email protected]
    Signed-off-by: Anastasia Kovaleva <[email protected]>
    Reviewed-by: Dmitry Bogdanov <[email protected]>
    Reviewed-by: Kuniyuki Iwashima <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
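
    Why removal inside a plain list walk is unsafe, and what the "safe"
    form does differently, can be shown with a minimal singly linked list.
    This is a userspace sketch with hypothetical names (struct node,
    push(), remove_group(), count()), not the netlink code; the kernel's
    "_safe" list iterators follow the same idea of saving the successor
    before the current element may go away.

```c
#include <stdlib.h>

struct node {
	int group;
	struct node *next;
};

/* Prepend a node; returns the new head (or the old one if malloc fails). */
static struct node *push(struct node *head, int group)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->group = group;
	n->next = head;
	return n;
}

/* Remove every node belonging to 'group' while walking the list. */
static struct node *remove_group(struct node *head, int group)
{
	struct node **link = &head;
	struct node *cur = head;

	while (cur) {
		/* Save the successor first: cur may be freed below. */
		struct node *next = cur->next;

		if (cur->group == group) {
			*link = next;
			free(cur);
		} else {
			link = &cur->next;
		}
		cur = next;
	}
	return head;
}

static int count(const struct node *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}
```

    An unsafe variant would advance with cur = cur->next after free(cur),
    reading memory that no longer belongs to the list.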

net: ibm: emac: mal: fix wrong goto [+ + +]

Author: Rosen Penev <[email protected]>
Date:   Mon Oct 7 16:57:11 2024 -0700

    net: ibm: emac: mal: fix wrong goto
    
    [ Upstream commit 08c8acc9d8f3f70d62dd928571368d5018206490 ]
    
    dcr_map is called in the previous if and therefore needs to be unmapped.
    
    Fixes: 1ff0fcfcb1a6 ("ibm_newemac: Fix new MAL feature handling")
    Signed-off-by: Rosen Penev <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: phy: bcm84881: Fix some error handling paths [+ + +]

Author: Christophe JAILLET <[email protected]>
Date:   Thu Oct 3 21:03:21 2024 +0200

    net: phy: bcm84881: Fix some error handling paths
    
    [ Upstream commit 9234a2549cb6ac038bec36cc7c084218e9575513 ]
    
    If phy_read_mmd() fails, the error code stored in 'bmsr' should be returned
    instead of 'val' which is likely to be 0.
    
    Fixes: 75f4d8d10e01 ("net: phy: add Broadcom BCM84881 PHY driver")
    Signed-off-by: Christophe JAILLET <[email protected]>
    Link: https://patch.msgid.link/3e1755b0c40340d00e089d6adae5bca2f8c79e53.1727982168.git.christophe.jaillet@wanadoo.fr
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
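
    The class of bug fixed here, returning the wrong variable on an error
    path, is easy to reproduce in isolation. The sketch below is
    hypothetical: fake_read() stands in for a register read that can fail,
    and the two read_status_*() functions only mirror the shape described
    in the commit message, not the driver's actual code.

```c
#include <errno.h>

/* Fake register read: register 0 succeeds with value 0, register 1 fails. */
static int fake_read(int reg)
{
	return reg == 1 ? -EIO : 0;
}

static int read_status_buggy(void)
{
	int val = fake_read(0);
	int bmsr;

	if (val < 0)
		return val;
	bmsr = fake_read(1);
	if (bmsr < 0)
		return val;	/* bug: val holds a valid value, so this reports success */
	return 0;
}

static int read_status_fixed(void)
{
	int val = fake_read(0);
	int bmsr;

	if (val < 0)
		return val;
	bmsr = fake_read(1);
	if (bmsr < 0)
		return bmsr;	/* propagate the error from the read that failed */
	return 0;
}
```

    With the second read failing, the buggy variant returns 0 and the
    caller carries on as if the status were valid, while the fixed variant
    returns -EIO.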

net: phy: dp83869: fix memory corruption when enabling fiber [+ + +]

Author: Ingo van Lil <[email protected]>
Date:   Wed Oct 2 18:18:07 2024 +0200

    net: phy: dp83869: fix memory corruption when enabling fiber
    
    [ Upstream commit a842e443ca8184f2dc82ab307b43a8b38defd6a5 ]
    
    When configuring the fiber port, the DP83869 PHY driver incorrectly
    calls linkmode_set_bit() with a bit mask (1 << 10) rather than a bit
    number (10). This corrupts some other memory location -- in case of
    arm64 the priv pointer in the same structure.
    
    Since the advertising flags are updated from supported at the end of the
    function the incorrect line isn't needed at all and can be removed.
    
    Fixes: a29de52ba2a1 ("net: dp83869: Add ability to advertise Fiber connection")
    Signed-off-by: Ingo van Lil <[email protected]>
    Reviewed-by: Alexander Sverdlin <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

net: phy: Remove LED entry from LEDs list on unregister [+ + +]

Author: Christian Marangi <[email protected]>
Date:   Fri Oct 4 20:27:58 2024 +0200

    net: phy: Remove LED entry from LEDs list on unregister
    
    commit f50b5d74c68e551667e265123659b187a30fe3a5 upstream.
    
    Commit c938ab4da0eb ("net: phy: Manual remove LEDs to ensure correct
    ordering") correctly fixed a problem with using devm_ but missed
    removing the LED entry from the LEDs list.
    
    This causes a kernel panic in a specific scenario where the port for the
    PHY is torn down and brought up again and the kmod for the PHY is removed.
    
    On setting the port down the first time, the associated LEDs are
    correctly unregistered. The associated kmod for the PHY is then removed.
    The kmod is then added again and the port is brought up, so the
    associated LEDs are registered again.
    On putting the port down for the second time after these steps, the
    LED list now has 4 elements: the first 2 were already unregistered
    previously and the 2 new ones were registered again.
    
    This causes a kernel panic, as the first 2 elements should have been
    removed.
    
    Fix this by correctly removing the element when LED is unregistered.
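    
    The list behaviour described above can be reproduced with a small
    user-space sketch. All names here (struct node, struct demo_led and the
    demo_led_*() helpers) are hypothetical and only mimic the shape of a
    kernel-style circular list; this is not the phylib code.

```c
#include <stddef.h>

/* Minimal circular doubly-linked list, in the style of list_head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

static size_t list_len(const struct node *head)
{
	size_t len = 0;
	const struct node *pos;

	for (pos = head->next; pos != head; pos = pos->next)
		len++;
	return len;
}

/* Hypothetical LED object that is tracked on a per-device list. */
struct demo_led {
	struct node link;
	int registered;
};

static void demo_led_register(struct demo_led *led, struct node *leds)
{
	led->registered = 1;
	list_add_tail(&led->link, leds);
}

/* Buggy shape: the LED is unregistered but stays on the list. */
static void demo_led_unregister_buggy(struct demo_led *led)
{
	led->registered = 0;
}

/* Fixed shape: unregistering also unlinks the entry from the list. */
static void demo_led_unregister_fixed(struct demo_led *led)
{
	led->registered = 0;
	list_del(&led->link);
}
```

    With the buggy variant, two register/unregister cycles leave four
    entries on the list, two of them stale; with the fixed variant the
    list only ever holds the currently registered LEDs.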
    
    Reported-by: Daniel Golle <[email protected]>
    Tested-by: Daniel Golle <[email protected]>
    Cc: [email protected]
    Fixes: c938ab4da0eb ("net: phy: Manual remove LEDs to ensure correct ordering")
    Signed-off-by: Christian Marangi <[email protected]>
    Reviewed-by: Andrew Lunn <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

netfilter: br_netfilter: fix panic with metadata_dst skb [+ + +]

Author: Andy Roulin <[email protected]>
Date:   Tue Oct 1 08:43:59 2024 -0700

    netfilter: br_netfilter: fix panic with metadata_dst skb
    
    [ Upstream commit f9ff7665cd128012868098bbd07e28993e314fdb ]
    
    Fix a kernel panic in the br_netfilter module when sending untagged
    traffic via a VxLAN device.
    This happens during the check for fragmentation in br_nf_dev_queue_xmit.
    
    It is dependent on:
    1) the br_netfilter module being loaded;
    2) net.bridge.bridge-nf-call-iptables set to 1;
    3) a bridge with a VxLAN (single-vxlan-device) netdevice as a bridge port;
    4) untagged frames with size higher than the VxLAN MTU forwarded/flooded
    
    When forwarding the untagged packet to the VxLAN bridge port, before
    the netfilter hooks are called, br_handle_egress_vlan_tunnel is called and
    changes the skb_dst to the tunnel dst. The tunnel_dst is a metadata type
    of dst, i.e., skb_valid_dst(skb) is false, and metadata->dst.dev is NULL.
    
    Then in the br_netfilter hooks, in br_nf_dev_queue_xmit, there's a check
    for frames that need to be fragmented: frames larger than the VxLAN
    device MTU end up calling br_nf_ip_fragment, which in turn calls
    ip_skb_dst_mtu.
    
    The ip_dst_mtu tries to use the skb_dst(skb) as if it were a valid dst
    with a valid dst->dev, hence the crash.
    
    This case was never supported in the first place, so drop the packet
    instead.
    
    PING 10.0.0.2 (10.0.0.2) from 0.0.0.0 h1-eth0: 2000(2028) bytes of data.
    [  176.291791] Unable to handle kernel NULL pointer dereference at
    virtual address 0000000000000110
    [  176.292101] Mem abort info:
    [  176.292184]   ESR = 0x0000000096000004
    [  176.292322]   EC = 0x25: DABT (current EL), IL = 32 bits
    [  176.292530]   SET = 0, FnV = 0
    [  176.292709]   EA = 0, S1PTW = 0
    [  176.292862]   FSC = 0x04: level 0 translation fault
    [  176.293013] Data abort info:
    [  176.293104]   ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
    [  176.293488]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
    [  176.293787]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
    [  176.293995] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000043ef5000
    [  176.294166] [0000000000000110] pgd=0000000000000000,
    p4d=0000000000000000
    [  176.294827] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
    [  176.295252] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel veth
    br_netfilter bridge stp llc ipv6 crct10dif_ce
    [  176.295923] CPU: 0 PID: 188 Comm: ping Not tainted
    6.8.0-rc3-g5b3fbd61b9d1 #2
    [  176.296314] Hardware name: linux,dummy-virt (DT)
    [  176.296535] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS
    BTYPE=--)
    [  176.296808] pc : br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
    [  176.297382] lr : br_nf_dev_queue_xmit+0x2ac/0x4ec [br_netfilter]
    [  176.297636] sp : ffff800080003630
    [  176.297743] x29: ffff800080003630 x28: 0000000000000008 x27:
    ffff6828c49ad9f8
    [  176.298093] x26: ffff6828c49ad000 x25: 0000000000000000 x24:
    00000000000003e8
    [  176.298430] x23: 0000000000000000 x22: ffff6828c4960b40 x21:
    ffff6828c3b16d28
    [  176.298652] x20: ffff6828c3167048 x19: ffff6828c3b16d00 x18:
    0000000000000014
    [  176.298926] x17: ffffb0476322f000 x16: ffffb7e164023730 x15:
    0000000095744632
    [  176.299296] x14: ffff6828c3f1c880 x13: 0000000000000002 x12:
    ffffb7e137926a70
    [  176.299574] x11: 0000000000000001 x10: ffff6828c3f1c898 x9 :
    0000000000000000
    [  176.300049] x8 : ffff6828c49bf070 x7 : 0008460f18d5f20e x6 :
    f20e0100bebafeca
    [  176.300302] x5 : ffff6828c7f918fe x4 : ffff6828c49bf070 x3 :
    0000000000000000
    [  176.300586] x2 : 0000000000000000 x1 : ffff6828c3c7ad00 x0 :
    ffff6828c7f918f0
    [  176.300889] Call trace:
    [  176.301123]  br_nf_dev_queue_xmit+0x390/0x4ec [br_netfilter]
    [  176.301411]  br_nf_post_routing+0x2a8/0x3e4 [br_netfilter]
    [  176.301703]  nf_hook_slow+0x48/0x124
    [  176.302060]  br_forward_finish+0xc8/0xe8 [bridge]
    [  176.302371]  br_nf_hook_thresh+0x124/0x134 [br_netfilter]
    [  176.302605]  br_nf_forward_finish+0x118/0x22c [br_netfilter]
    [  176.302824]  br_nf_forward_ip.part.0+0x264/0x290 [br_netfilter]
    [  176.303136]  br_nf_forward+0x2b8/0x4e0 [br_netfilter]
    [  176.303359]  nf_hook_slow+0x48/0x124
    [  176.303803]  __br_forward+0xc4/0x194 [bridge]
    [  176.304013]  br_flood+0xd4/0x168 [bridge]
    [  176.304300]  br_handle_frame_finish+0x1d4/0x5c4 [bridge]
    [  176.304536]  br_nf_hook_thresh+0x124/0x134 [br_netfilter]
    [  176.304978]  br_nf_pre_routing_finish+0x29c/0x494 [br_netfilter]
    [  176.305188]  br_nf_pre_routing+0x250/0x524 [br_netfilter]
    [  176.305428]  br_handle_frame+0x244/0x3cc [bridge]
    [  176.305695]  __netif_receive_skb_core.constprop.0+0x33c/0xecc
    [  176.306080]  __netif_receive_skb_one_core+0x40/0x8c
    [  176.306197]  __netif_receive_skb+0x18/0x64
    [  176.306369]  process_backlog+0x80/0x124
    [  176.306540]  __napi_poll+0x38/0x17c
    [  176.306636]  net_rx_action+0x124/0x26c
    [  176.306758]  __do_softirq+0x100/0x26c
    [  176.307051]  ____do_softirq+0x10/0x1c
    [  176.307162]  call_on_irq_stack+0x24/0x4c
    [  176.307289]  do_softirq_own_stack+0x1c/0x2c
    [  176.307396]  do_softirq+0x54/0x6c
    [  176.307485]  __local_bh_enable_ip+0x8c/0x98
    [  176.307637]  __dev_queue_xmit+0x22c/0xd28
    [  176.307775]  neigh_resolve_output+0xf4/0x1a0
    [  176.308018]  ip_finish_output2+0x1c8/0x628
    [  176.308137]  ip_do_fragment+0x5b4/0x658
    [  176.308279]  ip_fragment.constprop.0+0x48/0xec
    [  176.308420]  __ip_finish_output+0xa4/0x254
    [  176.308593]  ip_finish_output+0x34/0x130
    [  176.308814]  ip_output+0x6c/0x108
    [  176.308929]  ip_send_skb+0x50/0xf0
    [  176.309095]  ip_push_pending_frames+0x30/0x54
    [  176.309254]  raw_sendmsg+0x758/0xaec
    [  176.309568]  inet_sendmsg+0x44/0x70
    [  176.309667]  __sys_sendto+0x110/0x178
    [  176.309758]  __arm64_sys_sendto+0x28/0x38
    [  176.309918]  invoke_syscall+0x48/0x110
    [  176.310211]  el0_svc_common.constprop.0+0x40/0xe0
    [  176.310353]  do_el0_svc+0x1c/0x28
    [  176.310434]  el0_svc+0x34/0xb4
    [  176.310551]  el0t_64_sync_handler+0x120/0x12c
    [  176.310690]  el0t_64_sync+0x190/0x194
    [  176.311066] Code: f9402e61 79402aa2 927ff821 f9400023 (f9408860)
    [  176.315743] ---[ end trace 0000000000000000 ]---
    [  176.316060] Kernel panic - not syncing: Oops: Fatal exception in
    interrupt
    [  176.316371] Kernel Offset: 0x37e0e3000000 from 0xffff800080000000
    [  176.316564] PHYS_OFFSET: 0xffff97d780000000
    [  176.316782] CPU features: 0x0,88000203,3c020000,0100421b
    [  176.317210] Memory Limit: none
    [  176.317527] ---[ end Kernel panic - not syncing: Oops: Fatal
    Exception in interrupt ]---
    
    Fixes: 11538d039ac6 ("bridge: vlan dst_metadata hooks in ingress and egress paths")
    Reviewed-by: Ido Schimmel <[email protected]>
    Signed-off-by: Andy Roulin <[email protected]>
    Acked-by: Nikolay Aleksandrov <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: fib: check correct rtable in vrf setups [+ + +]

Author: Florian Westphal <[email protected]>
Date:   Wed Oct 9 09:19:02 2024 +0200

    netfilter: fib: check correct rtable in vrf setups
    
    [ Upstream commit 05ef7055debc804e8083737402127975e7244fc4 ]
    
    We need to init l3mdev unconditionally, else main routing table is searched
    and incorrect result is returned unless strict (iif keyword) matching is
    requested.
    
    Next patch adds a selftest for this.
    
    Fixes: 2a8a7c0eaa87 ("netfilter: nft_fib: Fix for rpath check with VRF devices")
    Closes: https://bugzilla.netfilter.org/show_bug.cgi?id=1761
    Signed-off-by: Florian Westphal <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash [+ + +]

Author: Florian Westphal <[email protected]>
Date:   Tue Sep 10 11:38:14 2024 +0200

    netfilter: nf_nat: don't try nat source port reallocation for reverse dir clash
    
    [ Upstream commit d8f84a9bc7c4e07fdc4edc00f9e868b8db974ccb ]
    
    A conntrack entry can be inserted to the connection tracking table if there
    is no existing entry with an identical tuple in either direction.
    
    Example:
    INITIATOR -> NAT/PAT -> RESPONDER
    
    Initiator passes through NAT/PAT ("us") and SNAT is done (saddr rewrite).
    Then, later, NAT/PAT machine itself also wants to connect to RESPONDER.
    
    This will not work if the SNAT done earlier has same IP:PORT source pair.
    
    Conntrack table has:
    ORIGINAL: $IP_INITIATOR:$SPORT -> $IP_RESPONDER:$DPORT
    REPLY:    $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT
    
    and new locally originating connection wants:
    ORIGINAL: $IP_NAT:$SPORT -> $IP_RESPONDER:$DPORT
    REPLY:    $IP_RESPONDER:$DPORT -> $IP_NAT:$SPORT
    
    This is handled by the NAT engine which will do a source port reallocation
    for the locally originating connection that is colliding with an existing
    tuple by attempting a source port rewrite.
    
    This is done even if this new connection attempt did not go through a
    masquerade/snat rule.
    
    There is a rare race condition with connection-less protocols like UDP,
    where we do the port reallocation even though it's not needed.
    
    This happens when new packets from the same, pre-existing flow are received
    in both directions at the exact same time on different CPUs after the
    conntrack table was flushed (or conntrack becomes active for first time).
    
    With strict ordering/single cpu, the first packet creates new ct entry and
    second packet is resolved as established reply packet.
    
    With parallel processing, both packets are picked up as new and both get
    their own ct entry.
    
    In this case, the 'reply' packet (picked up as ORIGINAL) can be mangled by
    NAT engine because a port collision is detected.
    
    This change isn't enough to prevent a packet drop later during
    nf_conntrack_confirm(), the existing clash resolution strategy will not
    detect such reverse clash case.  This is resolved by a followup patch.
    
    Signed-off-by: Florian Westphal <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n [+ + +]

Author: Simon Horman <[email protected]>
Date:   Mon Sep 16 10:50:34 2024 +0100

    netfilter: nf_reject: Fix build warning when CONFIG_BRIDGE_NETFILTER=n
    
    [ Upstream commit fc56878ca1c288e49b5cbb43860a5938e3463654 ]
    
    If CONFIG_BRIDGE_NETFILTER is not enabled, which is the case for x86_64
    defconfig, then building nf_reject_ipv4.c and nf_reject_ipv6.c with W=1
    using gcc-14 results in the following warnings, which are treated as
    errors:
    
    net/ipv4/netfilter/nf_reject_ipv4.c: In function 'nf_send_reset':
    net/ipv4/netfilter/nf_reject_ipv4.c:243:23: error: variable 'niph' set but not used [-Werror=unused-but-set-variable]
      243 |         struct iphdr *niph;
          |                       ^~~~
    cc1: all warnings being treated as errors
    net/ipv6/netfilter/nf_reject_ipv6.c: In function 'nf_send_reset6':
    net/ipv6/netfilter/nf_reject_ipv6.c:286:25: error: variable 'ip6h' set but not used [-Werror=unused-but-set-variable]
      286 |         struct ipv6hdr *ip6h;
          |                         ^~~~
    cc1: all warnings being treated as errors
    
    Address this by reducing the scope of these local variables to where
    they are used, which is code only compiled when CONFIG_BRIDGE_NETFILTER
    is enabled.
    
    Compile tested and run through netfilter selftests.
    
    Reported-by: Andy Shevchenko <[email protected]>
    Closes: https://lore.kernel.org/netfilter-devel/[email protected]/
    Signed-off-by: Simon Horman <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

netfilter: xtables: avoid NFPROTO_UNSPEC where needed [+ + +]

Author: Florian Westphal <[email protected]>
Date:   Mon Oct 7 11:28:16 2024 +0200

    netfilter: xtables: avoid NFPROTO_UNSPEC where needed
    
    [ Upstream commit 0bfcb7b71e735560077a42847f69597ec7dcc326 ]
    
    syzbot managed to call xt_cluster match via ebtables:
    
     WARNING: CPU: 0 PID: 11 at net/netfilter/xt_cluster.c:72 xt_cluster_mt+0x196/0x780
     [..]
     ebt_do_table+0x174b/0x2a40
    
    Module registers to NFPROTO_UNSPEC, but it assumes ipv4/ipv6 packet
    processing.  As this is only useful to restrict locally terminating
    TCP/UDP traffic, register this for ipv4 and ipv6 family only.
    
    Pablo points out that this is a general issue, direct users of the
    set/getsockopt interface can call into targets/matches that were only
    intended for use with ip(6)tables.
    
    Check all UNSPEC matches and targets for similar issues:
    
    - matches and targets are fine except if they assume skb_network_header()
      is valid -- this is only true when called from inet layer: ip(6) stack
      pulls the ip/ipv6 header into linear data area.
    - targets that return XT_CONTINUE or other xtables verdicts must be
      restricted too, they are incompatible with the ebtables traverser, e.g.
      EBT_CONTINUE is a completely different value than XT_CONTINUE.
    
    Most matches/targets are changed to register for NFPROTO_IPV4/IPV6, as
    they are provided for use by ip(6)tables.
    
    The MARK target is also used by arptables, so register for NFPROTO_ARP too.
    
    While at it, bail out if connbytes fails to enable the corresponding
    conntrack family.
    
    This change passes the selftests in iptables.git.
    
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/netfilter-devel/[email protected]/
    Fixes: 0269ea493734 ("netfilter: xtables: add cluster match")
    Signed-off-by: Florian Westphal <[email protected]>
    Co-developed-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Pablo Neira Ayuso <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFSD: Mark filecache "down" if init fails [+ + +]

Author: Chuck Lever <[email protected]>
Date:   Sat Sep 21 14:25:37 2024 -0400

    NFSD: Mark filecache "down" if init fails
    
    [ Upstream commit dc0d0f885aa422f621bc1c2124133eff566b0bc8 ]
    
    NeilBrown says:
    > The handling of NFSD_FILE_CACHE_UP is strange.  nfsd_file_cache_init()
    > sets it, but doesn't clear it on failure.  So if nfsd_file_cache_init()
    > fails for some reason, nfsd_file_cache_shutdown() would still try to
    > clean up if it was called.
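    
    The flag handling can be sketched as below. This is a user-space
    illustration under stated assumptions: cache_up stands in for the
    NFSD_FILE_CACHE_UP flag, and demo_cache_init(), demo_cache_shutdown()
    and shutdown_cleanups are hypothetical names, not the nfsd code.

```c
#include <errno.h>
#include <stdbool.h>

static bool cache_up;		/* stand-in for NFSD_FILE_CACHE_UP */
static int shutdown_cleanups;	/* counts how often cleanup really ran */

/* Init sets the flag first. 'fail' simulates a setup failure and
 * 'clear_on_failure' selects the fixed behaviour (flag marked "down"). */
static int demo_cache_init(bool fail, bool clear_on_failure)
{
	if (cache_up)
		return 0;
	cache_up = true;

	if (fail) {
		if (clear_on_failure)
			cache_up = false;
		return -ENOMEM;
	}
	return 0;
}

/* Shutdown only cleans up when the flag says the cache is up. */
static void demo_cache_shutdown(void)
{
	if (!cache_up)
		return;
	cache_up = false;
	shutdown_cleanups++;
}
```

    Without clearing the flag on failure, a later shutdown still runs the
    cleanup path for a cache that was never fully set up.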
    
    Reported-by: NeilBrown <[email protected]>
    Fixes: c7b824c3d06c ("NFSD: Replace the "init once" mechanism")
    Signed-off-by: Chuck Lever <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies() [+ + +]

Author: Yanjun Zhang <[email protected]>
Date:   Tue Oct 1 16:39:30 2024 +0800

    NFSv4: Prevent NULL-pointer dereference in nfs42_complete_copies()
    
    [ Upstream commit a848c29e3486189aaabd5663bc11aea50c5bd144 ]
    
    On the node of an NFS client, some files saved in the mountpoint of the
    NFS server were copied to another location of the same NFS server.
    Accidentally, the nfs42_complete_copies() got a NULL-pointer dereference
    crash with the following syslog:
    
    [232064.838881] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116
    [232064.839360] NFSv4: state recovery failed for open file nfs/pvc-12b5200d-cd0f-46a3-b9f0-af8f4fe0ef64.qcow2, error = -116
    [232066.588183] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
    [232066.588586] Mem abort info:
    [232066.588701]   ESR = 0x0000000096000007
    [232066.588862]   EC = 0x25: DABT (current EL), IL = 32 bits
    [232066.589084]   SET = 0, FnV = 0
    [232066.589216]   EA = 0, S1PTW = 0
    [232066.589340]   FSC = 0x07: level 3 translation fault
    [232066.589559] Data abort info:
    [232066.589683]   ISV = 0, ISS = 0x00000007
    [232066.589842]   CM = 0, WnR = 0
    [232066.589967] user pgtable: 64k pages, 48-bit VAs, pgdp=00002000956ff400
    [232066.590231] [0000000000000058] pgd=08001100ae100003, p4d=08001100ae100003, pud=08001100ae100003, pmd=08001100b3c00003, pte=0000000000000000
    [232066.590757] Internal error: Oops: 96000007 [#1] SMP
    [232066.590958] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm vhost_net vhost vhost_iotlb tap tun ipt_rpfilter xt_multiport ip_set_hash_ip ip_set_hash_net xfrm_interface xfrm6_tunnel tunnel4 tunnel6 esp4 ah4 wireguard libcurve25519_generic veth xt_addrtype xt_set nf_conntrack_netlink ip_set_hash_ipportnet ip_set_hash_ipportip ip_set_bitmap_port ip_set_hash_ipport dummy ip_set ip_vs_sh ip_vs_wrr ip_vs_rr ip_vs iptable_filter sch_ingress nfnetlink_cttimeout vport_gre ip_gre ip_tunnel gre vport_geneve geneve vport_vxlan vxlan ip6_udp_tunnel udp_tunnel openvswitch nf_conncount dm_round_robin dm_service_time dm_multipath xt_nat xt_MASQUERADE nft_chain_nat nf_nat xt_mark xt_conntrack xt_comment nft_compat nft_counter nf_tables nfnetlink ocfs2 ocfs2_nodemanager ocfs2_stackglue iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_ssif nbd overlay 8021q garp mrp bonding tls rfkill sunrpc ext4 mbcache jbd2
    [232066.591052]  vfat fat cas_cache cas_disk ses enclosure scsi_transport_sas sg acpi_ipmi ipmi_si ipmi_devintf ipmi_msghandler ip_tables vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio dm_mirror dm_region_hash dm_log dm_mod nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc fuse xfs libcrc32c ast drm_vram_helper qla2xxx drm_kms_helper syscopyarea crct10dif_ce sysfillrect ghash_ce sysimgblt sha2_ce fb_sys_fops cec sha256_arm64 sha1_ce drm_ttm_helper ttm nvme_fc igb sbsa_gwdt nvme_fabrics drm nvme_core i2c_algo_bit i40e scsi_transport_fc megaraid_sas aes_neon_bs
    [232066.596953] CPU: 6 PID: 4124696 Comm: 10.253.166.125- Kdump: loaded Not tainted 5.15.131-9.cl9_ocfs2.aarch64 #1
    [232066.597356] Hardware name: Great Wall .\x93\x8e...RF6260 V5/GWMSSE2GL1T, BIOS T656FBE_V3.0.18 2024-01-06
    [232066.597721] pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
    [232066.598034] pc : nfs4_reclaim_open_state+0x220/0x800 [nfsv4]
    [232066.598327] lr : nfs4_reclaim_open_state+0x12c/0x800 [nfsv4]
    [232066.598595] sp : ffff8000f568fc70
    [232066.598731] x29: ffff8000f568fc70 x28: 0000000000001000 x27: ffff21003db33000
    [232066.599030] x26: ffff800005521ae0 x25: ffff0100f98fa3f0 x24: 0000000000000001
    [232066.599319] x23: ffff800009920008 x22: ffff21003db33040 x21: ffff21003db33050
    [232066.599628] x20: ffff410172fe9e40 x19: ffff410172fe9e00 x18: 0000000000000000
    [232066.599914] x17: 0000000000000000 x16: 0000000000000004 x15: 0000000000000000
    [232066.600195] x14: 0000000000000000 x13: ffff800008e685a8 x12: 00000000eac0c6e6
    [232066.600498] x11: 0000000000000000 x10: 0000000000000008 x9 : ffff8000054e5828
    [232066.600784] x8 : 00000000ffffffbf x7 : 0000000000000001 x6 : 000000000a9eb14a
    [232066.601062] x5 : 0000000000000000 x4 : ffff70ff8a14a800 x3 : 0000000000000058
    [232066.601348] x2 : 0000000000000001 x1 : 54dce46366daa6c6 x0 : 0000000000000000
    [232066.601636] Call trace:
    [232066.601749]  nfs4_reclaim_open_state+0x220/0x800 [nfsv4]
    [232066.601998]  nfs4_do_reclaim+0x1b8/0x28c [nfsv4]
    [232066.602218]  nfs4_state_manager+0x928/0x10f0 [nfsv4]
    [232066.602455]  nfs4_run_state_manager+0x78/0x1b0 [nfsv4]
    [232066.602690]  kthread+0x110/0x114
    [232066.602830]  ret_from_fork+0x10/0x20
    [232066.602985] Code: 1400000d f9403f20 f9402e61 91016003 (f9402c00)
    [232066.603284] SMP: stopping secondary CPUs
    [232066.606936] Starting crashdump kernel...
    [232066.607146] Bye!
    
    Analysing the vmcore, we know that nfs4_copy_state listed by destination
    nfs_server->ss_copies was added by the field copies in handle_async_copy(),
    and we found a waiting copy process with the stack as:
    PID: 3511963  TASK: ffff710028b47e00  CPU: 0   COMMAND: "cp"
     #0 [ffff8001116ef740] __switch_to at ffff8000081b92f4
     #1 [ffff8001116ef760] __schedule at ffff800008dd0650
     #2 [ffff8001116ef7c0] schedule at ffff800008dd0a00
     #3 [ffff8001116ef7e0] schedule_timeout at ffff800008dd6aa0
     #4 [ffff8001116ef860] __wait_for_common at ffff800008dd166c
     #5 [ffff8001116ef8e0] wait_for_completion_interruptible at ffff800008dd1898
     #6 [ffff8001116ef8f0] handle_async_copy at ffff8000055142f4 [nfsv4]
     #7 [ffff8001116ef970] _nfs42_proc_copy at ffff8000055147c8 [nfsv4]
     #8 [ffff8001116efa80] nfs42_proc_copy at ffff800005514cf0 [nfsv4]
     #9 [ffff8001116efc50] __nfs4_copy_file_range.constprop.0 at ffff8000054ed694 [nfsv4]
    
    The NULL-pointer dereference occurred because nfs42_complete_copies()
    walked nfs_server->ss_copies using the ss_copies field of
    nfs4_copy_state, while the entries had been added through the copies
    field. As a result, the nfs4_copy_state address ffff0100f98fa3f0 was
    off by 0x10 and the data accessed through this pointer was also
    incorrect. Generally, the ordered list nfs4_state_owner->so_states
    means that open(O_RDWR) or open(O_WRITE) states are reclaimed first by
    nfs4_reclaim_open_state(). When reclaim of the destination state fails
    with NFS_STATE_RECOVERY_FAILED and the copies are not deleted from
    nfs_server->ss_copies, the source state may be passed to
    nfs42_complete_copies() earlier, resulting in this crash. To solve this
    issue, add a separate list_head, nfs_server->ss_src_copies, for
    server-to-server copies.
    
    Fixes: 0e65a32c8a56 ("NFS: handle source server reboot")
    Signed-off-by: Yanjun Zhang <[email protected]>
    Reviewed-by: Trond Myklebust <[email protected]>
    Signed-off-by: Anna Schumaker <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

nouveau/dmem: Fix privileged error in copy engine channel [+ + +]

Author: Yonatan Maman <[email protected]>
Date:   Tue Oct 8 14:59:42 2024 +0300

    nouveau/dmem: Fix privileged error in copy engine channel
    
    [ Upstream commit 04e0481526e30ab8c7e7580033d2f88b7ef2da3f ]
    
    When `nouveau_dmem_copy_one` is called, the following error occurs:
    
    [272146.675156] nouveau 0000:06:00.0: fifo: PBDMA9: 00000004 [HCE_PRIV]
    ch 1 00000300 00003386
    
    This indicates that a copy push command triggered a Host Copy Engine
    Privileged error on channel 1 (Copy Engine channel). To address this
    issue, modify the Copy Engine channel to allow privileged push commands.
    
    Fixes: 6de125383a5c ("drm/nouveau/fifo: expose runlist topology info on all chipsets")
    Signed-off-by: Yonatan Maman <[email protected]>
    Co-developed-by: Gal Shalom <[email protected]>
    Signed-off-by: Gal Shalom <[email protected]>
    Reviewed-by: Ben Skeggs <[email protected]>
    Signed-off-by: Danilo Krummrich <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error [+ + +]

Author: Yonatan Maman <[email protected]>
Date:   Tue Oct 8 14:59:43 2024 +0300

    nouveau/dmem: Fix vulnerability in migrate_to_ram upon copy error
    
    commit 835745a377a4519decd1a36d6b926e369b3033e2 upstream.
    
    The `nouveau_dmem_copy_one` function ensures that the copy push command is
    sent to the device firmware but does not track whether it was executed
    successfully.
    
    In the case of a copy error (e.g., firmware or hardware failure), the
    copy push command will be sent via the firmware channel, and
    `nouveau_dmem_copy_one` will likely report success, leading to the
    `migrate_to_ram` function returning a dirty HIGH_USER page to the user.
    
    This can result in a security vulnerability, as a HIGH_USER page that may
    contain sensitive or corrupted data could be returned to the user.
    
    To prevent this vulnerability, we allocate a zero page. Thus, in case of
    an error, a non-dirty (zero) page will be returned to the user.
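    
    A user-space analogue of the idea, assuming a copy step that reports
    success even when nothing was written. demo_copy(), demo_migrate(),
    demo_page_is_zero() and DEMO_PAGE_SIZE are hypothetical names, and
    calloc() merely plays the role of a zeroed page allocation; this is
    not the nouveau code.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical copy step: always reports success (0), but leaves 'dst'
 * untouched when the simulated device fails. */
static int demo_copy(unsigned char *dst, const unsigned char *src,
		     int device_fails)
{
	if (!device_fails)
		memcpy(dst, src, DEMO_PAGE_SIZE);
	return 0;
}

/* Returns a zero-filled destination page so that a silently failed copy
 * cannot expose stale memory. The caller frees the result. */
static unsigned char *demo_migrate(const unsigned char *src, int device_fails)
{
	unsigned char *dst = calloc(1, DEMO_PAGE_SIZE);

	if (!dst)
		return NULL;

	/* The return value cannot be trusted to reflect execution. */
	(void)demo_copy(dst, src, device_fails);
	return dst;
}

/* Returns 1 if every byte of the page is zero, else 0. */
static int demo_page_is_zero(const unsigned char *page)
{
	size_t i;

	for (i = 0; i < DEMO_PAGE_SIZE; i++)
		if (page[i])
			return 0;
	return 1;
}
```

    The design choice is to make the failure case harmless rather than to
    detect it: the destination is clean before the copy is attempted.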
    
    Fixes: 5be73b690875 ("drm/nouveau/dmem: device memory helpers for SVM")
    Signed-off-by: Yonatan Maman <[email protected]>
    Co-developed-by: Gal Shalom <[email protected]>
    Signed-off-by: Gal Shalom <[email protected]>
    Reviewed-by: Ben Skeggs <[email protected]>
    Cc: [email protected]
    Signed-off-by: Danilo Krummrich <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition [+ + +]

Author: Kaixin Wang <[email protected]>
Date:   Tue Sep 10 01:20:07 2024 +0800

    ntb: ntb_hw_switchtec: Fix use after free vulnerability in switchtec_ntb_remove due to race condition
    
    [ Upstream commit e51aded92d42784313ba16c12f4f88cc4f973bbb ]
    
    switchtec_ntb_add() may call switchtec_ntb_init_sndev(), which binds
    &sndev->check_link_status_work to check_link_status_work().
    switchtec_ntb_link_notification() may then be called to start the work.
    
    If the module is removed, switchtec_ntb_remove() is called to clean up
    and frees sndev through kfree(sndev) while the work mentioned above may
    still use it. The sequence of operations that may lead to a UAF bug is
    as follows:
    
    CPU0                                 CPU1
    
                            | check_link_status_work
    switchtec_ntb_remove    |
    kfree(sndev);           |
                            | if (sndev->link_force_down)
                            | // use sndev
    
    Fix it by ensuring that the work is canceled before proceeding with
    the cleanup in switchtec_ntb_remove.
    
    Signed-off-by: Kaixin Wang <[email protected]>
    Reviewed-by: Logan Gunthorpe <[email protected]>
    Signed-off-by: Jon Mason <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

ntfs3: Change to non-blocking allocation in ntfs_d_hash [+ + +]

Author: Diogo Jahchan Koike <[email protected]>
Date:   Mon Sep 2 14:19:32 2024 -0300

    ntfs3: Change to non-blocking allocation in ntfs_d_hash
    
    [ Upstream commit 589996bf8c459deb5bbc9747d8f1c51658608103 ]
    
    d_hash is done while under "rcu-walk" and should not sleep.
    __get_name() allocates using GFP_KERNEL, having the possibility
    to sleep when under memory pressure. Change the allocation to
    GFP_NOWAIT.
    
    Reported-by: [email protected]
    Closes: https://syzkaller.appspot.com/bug?extid=7f71f79bbfb4427b00e1
    Fixes: d392e85fd1e8 ("fs/ntfs3: Fix the format of the "nocase" mount option")
    Signed-off-by: Diogo Jahchan Koike <[email protected]>
    Signed-off-by: Konstantin Komarov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

PCI: Add ACS quirk for Qualcomm SA8775P [+ + +]

Author: Subramanian Ananthanarayanan <[email protected]>
Date:   Fri Sep 6 10:52:27 2024 +0530

    PCI: Add ACS quirk for Qualcomm SA8775P
    
    [ Upstream commit 026f84d3fa62d215b11cbeb5a5d97df941e93b5c ]
    
    The Qualcomm SA8775P root ports don't advertise an ACS capability, but they
    do provide ACS-like features to disable peer transactions and validate bus
    numbers in requests.
    
    Thus, add an ACS quirk for the SA8775P.
    
    Link: https://lore.kernel.org/linux-pci/[email protected]
    Signed-off-by: Subramanian Ananthanarayanan <[email protected]>
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

PCI: Add function 0 DMA alias quirk for Glenfly Arise chip [+ + +]

Author: WangYuli <[email protected]>
Date:   Fri Aug 23 17:57:08 2024 +0800

    PCI: Add function 0 DMA alias quirk for Glenfly Arise chip
    
    [ Upstream commit 9246b487ab3c3b5993aae7552b7a4c541cc14a49 ]
    
    Add DMA support for audio function of Glenfly Arise chip, which uses
    Requester ID of function 0.
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: SiyuLi <[email protected]>
    Signed-off-by: WangYuli <[email protected]>
    [bhelgaas: lower-case hex to match local code, drop unused Device IDs]
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Reviewed-by: Takashi Iwai <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

PCI: Mark Creative Labs EMU20k2 INTx masking as broken [+ + +]

Author: Alex Williamson <[email protected]>
Date:   Thu Sep 12 15:53:27 2024 -0600

    PCI: Mark Creative Labs EMU20k2 INTx masking as broken
    
    [ Upstream commit 2910306655a7072640021563ec9501bfa67f0cb1 ]
    
    Per user reports, the Creative Labs EMU20k2 (Sound Blaster X-Fi
    Titanium Series) generates spurious interrupts when used with
    vfio-pci unless DisINTx masking support is disabled.
    
    Thus, quirk the device to mark INTx masking as broken.
    
    Closes: https://lore.kernel.org/all/VI1PR10MB8207C507DB5420AB4C7281E0DB9A2@VI1PR10MB8207.EURPRD10.PROD.OUTLOOK.COM
    Link: https://lore.kernel.org/linux-pci/[email protected]
    Reported-by: zdravko delineshev <[email protected]>
    Signed-off-by: Alex Williamson <[email protected]>
    [kwilczynski: commit log]
    Signed-off-by: Krzysztof Wilczyński <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

pds_core: no health-thread in VF path [+ + +]

Author: Shannon Nelson <[email protected]>
Date:   Fri Feb 9 16:20:02 2024 -0800

    pds_core: no health-thread in VF path
    
    [ Upstream commit 3e36031cc0540ca97b615cbb940331892cbd3d21 ]
    
    The VFs don't run the health thread, so don't try to
    stop or restart the non-existent timer or work item.
    
    Fixes: d9407ff11809 ("pds_core: Prevent health thread from running during reset/remove")
    Reviewed-by: Brett Creeley <[email protected]>
    Signed-off-by: Shannon Nelson <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

perf sched: Fix memory leak in perf_sched__map() [+ + +]

Author: Yang Jihong <[email protected]>
Date:   Tue Feb 6 08:32:25 2024 +0000

    perf sched: Fix memory leak in perf_sched__map()
    
    [ Upstream commit ef76a5af819743d405674f6de5d0e63320ac653e ]
    
    perf_sched__map() needs to free memory of map_cpus, color_pids and
    color_cpus in normal path and rollback allocated memory in error path.
    
    Signed-off-by: Yang Jihong <[email protected]>
    Signed-off-by: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 1a5efc9e13f3 ("libsubcmd: Don't free the usage string")
    Signed-off-by: Sasha Levin <[email protected]>
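
The "free in the normal path, roll back in the error path" shape described above is commonly written with goto labels that unwind in reverse allocation order. The sketch below assumes three plain int buffers as stand-ins for map_cpus, color_pids and color_cpus; the struct and function names are hypothetical and the real perf types differ.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the three buffers named in the commit message. */
struct map_state {
	int *map_cpus;
	int *color_pids;
	int *color_cpus;
};

/* Allocate all three buffers. On failure, free whatever was already
 * allocated, in reverse order, and return -1. */
static int map_state_init(struct map_state *s, size_t n)
{
	s->map_cpus = NULL;
	s->color_pids = NULL;
	s->color_cpus = NULL;

	s->map_cpus = calloc(n, sizeof(*s->map_cpus));
	if (!s->map_cpus)
		goto out;
	s->color_pids = calloc(n, sizeof(*s->color_pids));
	if (!s->color_pids)
		goto out_free_map_cpus;
	s->color_cpus = calloc(n, sizeof(*s->color_cpus));
	if (!s->color_cpus)
		goto out_free_color_pids;
	return 0;

out_free_color_pids:
	free(s->color_pids);
	s->color_pids = NULL;
out_free_map_cpus:
	free(s->map_cpus);
	s->map_cpus = NULL;
out:
	return -1;
}

/* Normal-path teardown: release everything map_state_init() set up. */
static void map_state_exit(struct map_state *s)
{
	free(s->color_cpus);
	free(s->color_pids);
	free(s->map_cpus);
	s->color_cpus = NULL;
	s->color_pids = NULL;
	s->map_cpus = NULL;
}
```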

perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}() [+ + +]

Author: Yang Jihong <[email protected]>
Date:   Tue Feb 6 08:32:27 2024 +0000

    perf sched: Move curr_pid and cpu_last_switched initialization to perf_sched__{lat|map|replay}()
    
    [ Upstream commit bd2cdf26b9ea000339d54adc82e87fdbf22c21c3 ]
    
    The curr_pid and cpu_last_switched are used only for the
    'perf sched replay/latency/map'. Put their initialization in
    perf_sched__{lat|map|replay}() to reduce unnecessary actions in other
    commands.
    
    Simple functional testing:
    
      # perf sched record perf bench sched messaging
      # Running 'sched/messaging' benchmark:
      # 20 sender and receiver processes per group
      # 10 groups == 400 processes run
    
           Total time: 0.209 [sec]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 16.456 MB perf.data (147907 samples) ]
    
      # perf sched lat
    
       -------------------------------------------------------------------------------------------------------------------------------------------
        Task                  |   Runtime ms  | Switches | Avg delay ms    | Max delay ms    | Max delay start           | Max delay end          |
       -------------------------------------------------------------------------------------------------------------------------------------------
        sched-messaging:(401) |   2990.699 ms |    38705 | avg:   0.661 ms | max:  67.046 ms | max start: 456532.624830 s | max end: 456532.691876 s
        qemu-system-x86:(7)   |    179.764 ms |     2191 | avg:   0.152 ms | max:  21.857 ms | max start: 456532.576434 s | max end: 456532.598291 s
        sshd:48125            |      0.522 ms |        2 | avg:   0.037 ms | max:   0.046 ms | max start: 456532.514610 s | max end: 456532.514656 s
      <SNIP>
        ksoftirqd/11:82       |      0.063 ms |        1 | avg:   0.005 ms | max:   0.005 ms | max start: 456532.769366 s | max end: 456532.769371 s
        kworker/9:0-mm_:34624 |      0.233 ms |       20 | avg:   0.004 ms | max:   0.007 ms | max start: 456532.690804 s | max end: 456532.690812 s
        migration/13:93       |      0.000 ms |        1 | avg:   0.004 ms | max:   0.004 ms | max start: 456532.512669 s | max end: 456532.512674 s
       -----------------------------------------------------------------------------------------------------------------
        TOTAL:                |   3180.750 ms |    41368 |
       ---------------------------------------------------
    
      # echo $?
      0
    
      # perf sched map
        *A0                                                               456532.510141 secs A0 => migration/0:15
        *.                                                                456532.510171 secs .  => swapper:0
         .  *B0                                                           456532.510261 secs B0 => migration/1:21
         .  *.                                                            456532.510279 secs
      <SNIP>
         L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .   .   .    456532.785979 secs
         L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .   .    456532.786054 secs
         L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .   .    456532.786127 secs
         L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7  .    456532.786197 secs
         L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7  L7 *L7   456532.786270 secs
      # echo $?
      0
    
      # perf sched replay
      run measurement overhead: 108 nsecs
      sleep measurement overhead: 66473 nsecs
      the run test took 1000002 nsecs
      the sleep test took 1082686 nsecs
      nr_run_events:        49334
      nr_sleep_events:      50054
      nr_wakeup_events:     34701
      target-less wakeups:  165
      multi-target wakeups: 766
      task      0 (             swapper:         0), nr_events: 15419
      task      1 (             swapper:         1), nr_events: 1
      task      2 (             swapper:         2), nr_events: 1
      <SNIP>
      task    715 (     sched-messaging:    110248), nr_events: 1438
      task    716 (     sched-messaging:    110249), nr_events: 512
      task    717 (     sched-messaging:    110250), nr_events: 500
      task    718 (     sched-messaging:    110251), nr_events: 537
      task    719 (     sched-messaging:    110252), nr_events: 823
      ------------------------------------------------------------
      #1  : 1325.288, ravg: 1325.29, cpu: 7823.35 / 7823.35
      #2  : 1363.606, ravg: 1329.12, cpu: 7655.53 / 7806.56
      #3  : 1349.494, ravg: 1331.16, cpu: 7544.80 / 7780.39
      #4  : 1311.488, ravg: 1329.19, cpu: 7495.13 / 7751.86
      #5  : 1309.902, ravg: 1327.26, cpu: 7266.65 / 7703.34
      #6  : 1309.535, ravg: 1325.49, cpu: 7843.86 / 7717.39
      #7  : 1316.482, ravg: 1324.59, cpu: 7854.41 / 7731.09
      #8  : 1366.604, ravg: 1328.79, cpu: 7955.81 / 7753.57
      #9  : 1326.286, ravg: 1328.54, cpu: 7466.86 / 7724.90
      #10 : 1356.653, ravg: 1331.35, cpu: 7566.60 / 7709.07
      # echo $?
      0
    
    Signed-off-by: Yang Jihong <[email protected]>
    Signed-off-by: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 1a5efc9e13f3 ("libsubcmd: Don't free the usage string")
    Signed-off-by: Sasha Levin <[email protected]>

perf sched: Move curr_thread initialization to perf_sched__map() [+ + +]

Author: Yang Jihong <[email protected]>
Date:   Tue Feb 6 08:32:26 2024 +0000

    perf sched: Move curr_thread initialization to perf_sched__map()
    
    [ Upstream commit 5e895278697c014e95ae7ae5e79a72ef68c5184e ]
    
    The curr_thread is used only for the 'perf sched map'. Put initialization
    in perf_sched__map() to reduce unnecessary actions in other commands.
    
    Simple functional testing:
    
      # perf sched record perf bench sched messaging
      # Running 'sched/messaging' benchmark:
      # 20 sender and receiver processes per group
      # 10 groups == 400 processes run
    
           Total time: 0.197 [sec]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 15.526 MB perf.data (140095 samples) ]
    
      # perf sched map
        *A0                                                               451264.532445 secs A0 => migration/0:15
        *.                                                                451264.532468 secs .  => swapper:0
         .  *B0                                                           451264.532537 secs B0 => migration/1:21
         .  *.                                                            451264.532560 secs
         .   .  *C0                                                       451264.532644 secs C0 => migration/2:27
         .   .  *.                                                        451264.532668 secs
         .   .   .  *D0                                                   451264.532753 secs D0 => migration/3:33
         .   .   .  *.                                                    451264.532778 secs
         .   .   .   .  *E0                                               451264.532861 secs E0 => migration/4:39
         .   .   .   .  *.                                                451264.532886 secs
         .   .   .   .   .  *F0                                           451264.532973 secs F0 => migration/5:45
      <SNIP>
         A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .   .    451264.790785 secs
         A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .   .    451264.790858 secs
         A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .   .    451264.790934 secs
         A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .   .    451264.791004 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .   .    451264.791075 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .   .    451264.791143 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .   .    451264.791232 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .   .    451264.791336 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .   .    451264.791407 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7  .    451264.791484 secs
         A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7  A7 *A7   451264.791553 secs
      # echo $?
      0
    
    Signed-off-by: Yang Jihong <[email protected]>
    Signed-off-by: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 1a5efc9e13f3 ("libsubcmd: Don't free the usage string")
    Signed-off-by: Sasha Levin <[email protected]>

perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay() [+ + +]

Author: Yang Jihong <[email protected]>
Date:   Tue Feb 6 08:32:24 2024 +0000

    perf sched: Move start_work_mutex and work_done_wait_mutex initialization to perf_sched__replay()
    
    [ Upstream commit c6907863519cf97ee09653cc8ec338a2328c2b6f ]
    
    The start_work_mutex and work_done_wait_mutex are used only for the
    'perf sched replay'. Put their initialization in perf_sched__replay() to
    reduce unnecessary actions in other commands.
    
    Simple functional testing:
    
      # perf sched record perf bench sched messaging
      # Running 'sched/messaging' benchmark:
      # 20 sender and receiver processes per group
      # 10 groups == 400 processes run
    
           Total time: 0.197 [sec]
      [ perf record: Woken up 1 times to write data ]
      [ perf record: Captured and wrote 14.952 MB perf.data (134165 samples) ]
    
      # perf sched replay
      run measurement overhead: 108 nsecs
      sleep measurement overhead: 65658 nsecs
      the run test took 999991 nsecs
      the sleep test took 1079324 nsecs
      nr_run_events:        42378
      nr_sleep_events:      43102
      nr_wakeup_events:     31852
      target-less wakeups:  17
      multi-target wakeups: 712
      task      0 (             swapper:         0), nr_events: 10451
      task      1 (             swapper:         1), nr_events: 3
      task      2 (             swapper:         2), nr_events: 1
      <SNIP>
      task    717 (     sched-messaging:     74483), nr_events: 152
      task    718 (     sched-messaging:     74484), nr_events: 1944
      task    719 (     sched-messaging:     74485), nr_events: 73
      task    720 (     sched-messaging:     74486), nr_events: 163
      task    721 (     sched-messaging:     74487), nr_events: 942
      task    722 (     sched-messaging:     74488), nr_events: 78
      task    723 (     sched-messaging:     74489), nr_events: 1090
      ------------------------------------------------------------
      #1  : 1366.507, ravg: 1366.51, cpu: 7682.70 / 7682.70
      #2  : 1410.072, ravg: 1370.86, cpu: 7723.88 / 7686.82
      #3  : 1396.296, ravg: 1373.41, cpu: 7568.20 / 7674.96
      #4  : 1381.019, ravg: 1374.17, cpu: 7531.81 / 7660.64
      #5  : 1393.826, ravg: 1376.13, cpu: 7725.25 / 7667.11
      #6  : 1401.581, ravg: 1378.68, cpu: 7594.82 / 7659.88
      #7  : 1381.337, ravg: 1378.94, cpu: 7371.22 / 7631.01
      #8  : 1373.842, ravg: 1378.43, cpu: 7894.92 / 7657.40
      #9  : 1364.697, ravg: 1377.06, cpu: 7324.91 / 7624.15
      #10 : 1363.613, ravg: 1375.72, cpu: 7209.55 / 7582.69
      # echo $?
      0
    
    Signed-off-by: Yang Jihong <[email protected]>
    Signed-off-by: Namhyung Kim <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 1a5efc9e13f3 ("libsubcmd: Don't free the usage string")
    Signed-off-by: Sasha Levin <[email protected]>

phonet: Handle error of rtnl_register_module(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Oct 8 11:47:37 2024 -0700

    phonet: Handle error of rtnl_register_module().
    
    [ Upstream commit b5e837c86041bef60f36cf9f20a641a30764379a ]
    
    Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl
    message handlers"), once the first rtnl_register_module() allocated
    rtnl_msg_handlers[PF_PHONET], the following calls never failed.
    
    However, after the commit, rtnl_register_module() could fail silently
    to allocate rtnl_msg_handlers[PF_PHONET][msgtype] and requires error
    handling for each call.
    
    Handling the error allows users to view a module as an all-or-nothing
    thing in terms of the rtnetlink functionality.  This prevents syzkaller
    from reporting spurious errors from its tests, where OOM often occurs
    and module is automatically loaded.
    
    Let's use rtnl_register_many() to handle the errors easily.
    
    Fixes: addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message handlers")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Acked-by: Rémi Denis-Courmont <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
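
The all-or-nothing behaviour described above can be illustrated in user space as "register every entry, and undo the ones already registered on the first failure". Everything in the sketch below is hypothetical (table size, names, the failure condition); it only shows the rollback shape and is not the rtnl_register_many() implementation.

```c
#include <stddef.h>

#define NR_MSGTYPES 8	/* illustrative table size, not a kernel constant */

typedef int (*handler_fn)(void);

static handler_fn handlers[NR_MSGTYPES];

/* Example handler used to populate the table. */
static int dummy_handler(void)
{
	return 0;
}

/* Stand-in for a single registration that may fail. Here it fails on an
 * out-of-range type; in the kernel the failure would be an allocation. */
static int register_one(int msgtype, handler_fn fn)
{
	if (msgtype < 0 || msgtype >= NR_MSGTYPES || !fn)
		return -1;
	handlers[msgtype] = fn;
	return 0;
}

static void unregister_one(int msgtype)
{
	handlers[msgtype] = NULL;
}

struct reg {
	int msgtype;
	handler_fn fn;
};

/* Register every entry, or none: on the first failure, undo the entries
 * registered so far and report the error. */
static int register_many(const struct reg *regs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (register_one(regs[i].msgtype, regs[i].fn)) {
			while (i--)
				unregister_one(regs[i].msgtype);
			return -1;
		}
	}
	return 0;
}
```

With this shape a caller only needs to check one return value at module init.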

phonet: no longer hold RTNL in route_dumpit() [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Tue May 7 12:17:48 2024 +0000

    phonet: no longer hold RTNL in route_dumpit()
    
    [ Upstream commit 58a4ff5d77b187086eb12d41d613749420947f19 ]
    
    route_dumpit() already relies on RCU, RTNL is not needed.
    
    Also change return value at the end of a dump.
    This allows NLMSG_DONE to be appended to the current
    skb at the end of a dump, saving a couple of recvmsg()
    system calls.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Cc: Remi Denis-Courmont <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Stable-dep-of: b5e837c86041 ("phonet: Handle error of rtnl_register_module().")
    Signed-off-by: Sasha Levin <[email protected]>

phy: qualcomm: eusb2-repeater: Rework init to drop redundant zero-out loop [+ + +]

Author: Abel Vesa <[email protected]>
Date:   Thu Feb 1 10:39:33 2024 +0200

    phy: qualcomm: eusb2-repeater: Rework init to drop redundant zero-out loop
    
    [ Upstream commit 734550d60cdf634299f0eac7f7fe15763ed990bb ]
    
    Instead of incrementing the base of the global reg fields, which renders
    the second instance of the repeater broken due to wrong offsets, use
    regmap with base and offset. As for zeroing out the rest of the tuning
    regs, avoid looping through the table and just use the table as is,
    as it is already zero initialized.
    
    Fixes: 99a517a582fc ("phy: qualcomm: phy-qcom-eusb2-repeater: Zero out untouched tuning regs")
    Tested-by: Elliot Berman <[email protected]> # sm8650-qrd
    Signed-off-by: Abel Vesa <[email protected]>
    Link: https://lore.kernel.org/r/20240201-phy-qcom-eusb2-repeater-fixes-v4-1-cf18c8cef6d7@linaro.org
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
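
The failure mode described above comes from adding a base to a shared table, so a second instance sees offsets that already include the first instance's base. A minimal sketch of the alternative, computing base plus offset at access time against a read-only table, follows. The offsets, base values and names are made up for illustration and do not come from the driver.

```c
#include <stdint.h>

/* Shared, read-only table of register offsets relative to an instance base.
 * The values are illustrative only. */
static const uint32_t tune_reg_offset[] = { 0x51, 0x53, 0x57 };

#define NUM_TUNE_REGS (sizeof(tune_reg_offset) / sizeof(tune_reg_offset[0]))

struct repeater {
	uint32_t base;	/* per-instance base address */
};

/* Compute the absolute address at access time. The shared table is never
 * modified, so a second instance with a different base stays correct.
 * The caller must pass idx < NUM_TUNE_REGS. */
static uint32_t tune_reg_addr(const struct repeater *rptr, unsigned int idx)
{
	return rptr->base + tune_reg_offset[idx];
}
```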

phy: qualcomm: phy-qcom-eusb2-repeater: Add tuning overrides [+ + +]

Author: Konrad Dybcio <[email protected]>
Date:   Wed Sep 13 11:53:26 2023 +0200

    phy: qualcomm: phy-qcom-eusb2-repeater: Add tuning overrides
    
    [ Upstream commit 56156a76e765d32009fee058697c591194d0829f ]
    
    There are devices in the wild, like the Sony Xperia 1 V that *require*
    different tuning than the base design for USB to work.
    
    Add support for overriding the necessary tuning values.
    
    Signed-off-by: Konrad Dybcio <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Stable-dep-of: 734550d60cdf ("phy: qualcomm: eusb2-repeater: Rework init to drop redundant zero-out loop")
    Signed-off-by: Sasha Levin <[email protected]>

platform/x86/intel/tpmi: Add defines to get version information [+ + +]

Author: Srinivas Pandruvada <[email protected]>
Date:   Tue Oct 3 11:49:14 2023 -0700

    platform/x86/intel/tpmi: Add defines to get version information
    
    [ Upstream commit 8874e414fe78718d0f2861fe511cecbd1cd73f4d ]
    
    Add defines to get major and minor version from a TPMI version field
    value. This will avoid code duplication to convert in every feature
    driver. Also add define for invalid version field.
    
    Signed-off-by: Srinivas Pandruvada <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Ilpo Järvinen <[email protected]>
    Signed-off-by: Ilpo Järvinen <[email protected]>
    Stable-dep-of: 1d390923974c ("powercap: intel_rapl_tpmi: Ignore minor version change")
    Signed-off-by: Sasha Levin <[email protected]>

powercap: intel_rapl_tpmi: Fix bogus register reading [+ + +]

Author: Zhang Rui <[email protected]>
Date:   Mon Sep 30 16:17:56 2024 +0800

    powercap: intel_rapl_tpmi: Fix bogus register reading
    
    commit 91e8f835a7eda4ba2c0c4002a3108a0e3b22d34e upstream.
    
    The TPMI_RAPL_REG_DOMAIN_INFO value needs to be multiplied by 8 to get
    the register offset.
    
    Cc: All applicable <[email protected]>
    Fixes: 903eb9fb85e3 ("powercap: intel_rapl_tpmi: Fix System Domain probing")
    Signed-off-by: Zhang Rui <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
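
The multiplication by 8 mentioned above is the usual index-to-byte-offset conversion for an array of 64-bit registers. The sketch below assumes each register is one 64-bit word and uses a hypothetical enum; the real register indices are defined in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; the real enum lives in the driver. */
enum reg_index {
	REG_A,
	REG_B,
	REG_DOMAIN_INFO
};

/* Each register is assumed to be one 64-bit word, so the register with
 * index N starts at byte offset N * 8 from the base. */
static size_t reg_byte_offset(enum reg_index idx)
{
	return (size_t)idx * sizeof(uint64_t);
}
```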

powercap: intel_rapl_tpmi: Ignore minor version change [+ + +]

Author: Zhang Rui <[email protected]>
Date:   Mon Sep 30 16:17:58 2024 +0800

    powercap: intel_rapl_tpmi: Ignore minor version change
    
    [ Upstream commit 1d390923974cc233245649cf23833e06b15a9ef7 ]
    
    The hardware definition of every TPMI feature contains a major and minor
    version. When there is a change in the MMIO offset or change in the
    definition of a field, hardware will change major version. For addition
    of new fields without modifying existing MMIO offsets or fields, only
    the minor version is changed.
    
    If the driver has not been updated to recognize a new hardware major
    version, it cannot provide the RAPL interface to users due to possible
    register layout incompatibilities. However, the driver does not need to
    be updated every time the hardware minor version changes because in that
    case it will just miss some new functionality exposed by the hardware.
    
    The current implementation causes the driver to refuse to work for any
    hardware version change which is unnecessarily restrictive.
    
    If there is a minor version mismatch, log an information message and
    continue, but if there is a major version mismatch, log a warning and
    exit (as before).
    
    Signed-off-by: Zhang Rui <[email protected]>
    Reviewed-by: Srinivas Pandruvada <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Fixes: 9eef7f9da928 ("powercap: intel_rapl: Introduce RAPL TPMI interface driver")
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
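
The policy described above (refuse on a major version mismatch, continue on a minor one) can be sketched as follows. The bit layout used here, major in bits 7:5 and minor in bits 4:0 of an 8-bit version field, is an assumption made for the sketch, and the macro, enum and function names are hypothetical rather than the driver's own.

```c
#include <stdint.h>

/* Assumed field layout for this sketch: major in bits 7:5, minor in bits 4:0. */
#define VER_MAJOR(v)  (((v) >> 5) & 0x07)
#define VER_MINOR(v)  ((v) & 0x1f)

enum ver_result {
	VER_OK,			/* exact match */
	VER_MINOR_MISMATCH,	/* log an info message and continue */
	VER_UNSUPPORTED		/* log a warning and bail out */
};

static enum ver_result check_version(uint8_t hw, uint8_t supported)
{
	/* A major change may move or redefine registers: refuse. */
	if (VER_MAJOR(hw) != VER_MAJOR(supported))
		return VER_UNSUPPORTED;
	/* A minor change only adds fields: the driver can keep working. */
	if (VER_MINOR(hw) != VER_MINOR(supported))
		return VER_MINOR_MISMATCH;
	return VER_OK;
}
```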

ppp: fix ppp_async_encode() illegal access [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Wed Oct 9 18:58:02 2024 +0000

    ppp: fix ppp_async_encode() illegal access
    
    [ Upstream commit 40dddd4b8bd08a69471efd96107a4e1c73fabefc ]
    
    syzbot reported an issue in ppp_async_encode() [1]
    
    In this case, pppoe_sendmsg() is called with a zero size.
    Then ppp_async_encode() is called with an empty skb.
    
    BUG: KMSAN: uninit-value in ppp_async_encode drivers/net/ppp/ppp_async.c:545 [inline]
     BUG: KMSAN: uninit-value in ppp_async_push+0xb4f/0x2660 drivers/net/ppp/ppp_async.c:675
      ppp_async_encode drivers/net/ppp/ppp_async.c:545 [inline]
      ppp_async_push+0xb4f/0x2660 drivers/net/ppp/ppp_async.c:675
      ppp_async_send+0x130/0x1b0 drivers/net/ppp/ppp_async.c:634
      ppp_channel_bridge_input drivers/net/ppp/ppp_generic.c:2280 [inline]
      ppp_input+0x1f1/0xe60 drivers/net/ppp/ppp_generic.c:2304
      pppoe_rcv_core+0x1d3/0x720 drivers/net/ppp/pppoe.c:379
      sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1113
      __release_sock+0x1da/0x330 net/core/sock.c:3072
      release_sock+0x6b/0x250 net/core/sock.c:3626
      pppoe_sendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903
      sock_sendmsg_nosec net/socket.c:729 [inline]
      __sock_sendmsg+0x30f/0x380 net/socket.c:744
      ____sys_sendmsg+0x903/0xb60 net/socket.c:2602
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
      __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
      __do_sys_sendmmsg net/socket.c:2771 [inline]
      __se_sys_sendmmsg net/socket.c:2768 [inline]
      __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
      x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Uninit was created at:
      slab_post_alloc_hook mm/slub.c:4092 [inline]
      slab_alloc_node mm/slub.c:4135 [inline]
      kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4187
      kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587
      __alloc_skb+0x363/0x7b0 net/core/skbuff.c:678
      alloc_skb include/linux/skbuff.h:1322 [inline]
      sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2732
      pppoe_sendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867
      sock_sendmsg_nosec net/socket.c:729 [inline]
      __sock_sendmsg+0x30f/0x380 net/socket.c:744
      ____sys_sendmsg+0x903/0xb60 net/socket.c:2602
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
      __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
      __do_sys_sendmmsg net/socket.c:2771 [inline]
      __se_sys_sendmmsg net/socket.c:2768 [inline]
      __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
      x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    CPU: 1 UID: 0 PID: 5411 Comm: syz.1.14 Not tainted 6.12.0-rc1-syzkaller-00165-g360c1f1f24c6 #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
    
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Reported-by: [email protected]
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Simon Horman <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

rcu/nocb: Fix rcuog wake-up from offline softirq [+ + +]

Author: Frederic Weisbecker <[email protected]>
Date:   Thu Oct 10 18:36:09 2024 +0200

    rcu/nocb: Fix rcuog wake-up from offline softirq
    
    [ Upstream commit f7345ccc62a4b880cf76458db5f320725f28e400 ]
    
    After a CPU has set itself offline and before it eventually calls
    rcutree_report_cpu_dead(), there are still opportunities for callbacks
    to be enqueued, for example from a softirq. When that happens on NOCB,
    the rcuog wake-up is deferred through an IPI to an online CPU in order
    not to call into the scheduler and risk arming the RT-bandwidth after
    hrtimers have been migrated out and disabled.
    
    But performing a synchronized IPI from a softirq is buggy as reported in
    the following scenario:
    
            WARNING: CPU: 1 PID: 26 at kernel/smp.c:633 smp_call_function_single
            Modules linked in: rcutorture torture
            CPU: 1 UID: 0 PID: 26 Comm: migration/1 Not tainted 6.11.0-rc1-00012-g9139f93209d1 #1
            Stopper: multi_cpu_stop+0x0/0x320 <- __stop_cpus+0xd0/0x120
            RIP: 0010:smp_call_function_single
            <IRQ>
            swake_up_one_online
            __call_rcu_nocb_wake
            __call_rcu_common
            ? rcu_torture_one_read
            call_timer_fn
            __run_timers
            run_timer_softirq
            handle_softirqs
            irq_exit_rcu
            ? tick_handle_periodic
            sysvec_apic_timer_interrupt
            </IRQ>
    
    Fix this with forcing deferred rcuog wake up through the NOCB timer when
    the CPU is offline. The actual wake up will happen from
    rcutree_report_cpu_dead().
    
    Reported-by: kernel test robot <[email protected]>
    Closes: https://lore.kernel.org/oe-lkp/[email protected]
    Fixes: 9139f93209d1 ("rcu/nocb: Fix RT throttling hrtimer armed from offline CPU")
    Reviewed-by: "Joel Fernandes (Google)" <[email protected]>
    Signed-off-by: Frederic Weisbecker <[email protected]>
    Signed-off-by: Neeraj Upadhyay <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

rcu/nocb: Make IRQs disablement symmetric [+ + +]

Author: Frederic Weisbecker <[email protected]>
Date:   Tue Jan 9 23:24:00 2024 +0100

    rcu/nocb: Make IRQs disablement symmetric
    
    [ Upstream commit b913c3fe685e0aec80130975b0f330fd709ff324 ]
    
    Currently IRQs are disabled on call_rcu() and then depending on the
    context:
    
    * If the CPU is in nocb mode:
    
       - If the callback is enqueued in the bypass list, IRQs are re-enabled
         implicitly by rcu_nocb_try_bypass()
    
       - If the callback is enqueued in the normal list, IRQs are re-enabled
         implicitly by __call_rcu_nocb_wake()
    
    * If the CPU is NOT in nocb mode, IRQs are re-enabled explicitly from call_rcu()
    
    This makes the code a bit hard to follow, especially as it interleaves
    with nocb locking.
    
    To make the IRQ flags coverage clearer and also in order to prepare for
    moving all the nocb enqueue code to its own function, always re-enable
    the IRQ flags explicitly from call_rcu().
    
    Reviewed-by: Neeraj Upadhyay (AMD) <[email protected]>
    Signed-off-by: Frederic Weisbecker <[email protected]>
    Reviewed-by: Paul E. McKenney <[email protected]>
    Signed-off-by: Boqun Feng <[email protected]>
    Stable-dep-of: f7345ccc62a4 ("rcu/nocb: Fix rcuog wake-up from offline softirq")
    Signed-off-by: Sasha Levin <[email protected]>

RDMA/mad: Improve handling of timed out WRs of mad agent [+ + +]

Author: Saravanan Vajravel <[email protected]>
Date:   Mon Jul 22 16:33:25 2024 +0530

    RDMA/mad: Improve handling of timed out WRs of mad agent
    
    [ Upstream commit 2a777679b8ccd09a9a65ea0716ef10365179caac ]
    
    The current timeout handler of the mad agent acquires/releases the
    mad_agent_priv lock for every timed out WR. This causes heavy lock
    contention when a large number of WRs has to be handled inside the
    timeout handler.
    
    This leads to a soft lockup with the trace below in some use cases where
    the rdma-cm path is used to establish connections between peer nodes.
    
    Trace:
    -----
     BUG: soft lockup - CPU#4 stuck for 26s! [kworker/u128:3:19767]
     CPU: 4 PID: 19767 Comm: kworker/u128:3 Kdump: loaded Tainted: G OE
         -------  ---  5.14.0-427.13.1.el9_4.x86_64 #1
     Hardware name: Dell Inc. PowerEdge R740/01YM03, BIOS 2.4.8 11/26/2019
     Workqueue: ib_mad1 timeout_sends [ib_core]
     RIP: 0010:__do_softirq+0x78/0x2ac
     RSP: 0018:ffffb253449e4f98 EFLAGS: 00000246
     RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 000000000000001f
     RDX: 000000000000001d RSI: 000000003d1879ab RDI: fff363b66fd3a86b
     RBP: ffffb253604cbcd8 R08: 0000009065635f3b R09: 0000000000000000
     R10: 0000000000000040 R11: ffffb253449e4ff8 R12: 0000000000000000
     R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000040
     FS:  0000000000000000(0000) GS:ffff8caa1fc80000(0000) knlGS:0000000000000000
     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
     CR2: 00007fd9ec9db900 CR3: 0000000891934006 CR4: 00000000007706e0
     DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
     DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
     PKRU: 55555554
     Call Trace:
      <IRQ>
      ? show_trace_log_lvl+0x1c4/0x2df
      ? show_trace_log_lvl+0x1c4/0x2df
      ? __irq_exit_rcu+0xa1/0xc0
      ? watchdog_timer_fn+0x1b2/0x210
      ? __pfx_watchdog_timer_fn+0x10/0x10
      ? __hrtimer_run_queues+0x127/0x2c0
      ? hrtimer_interrupt+0xfc/0x210
      ? __sysvec_apic_timer_interrupt+0x5c/0x110
      ? sysvec_apic_timer_interrupt+0x37/0x90
      ? asm_sysvec_apic_timer_interrupt+0x16/0x20
      ? __do_softirq+0x78/0x2ac
      ? __do_softirq+0x60/0x2ac
      __irq_exit_rcu+0xa1/0xc0
      sysvec_call_function_single+0x72/0x90
      </IRQ>
      <TASK>
      asm_sysvec_call_function_single+0x16/0x20
     RIP: 0010:_raw_spin_unlock_irq+0x14/0x30
     RSP: 0018:ffffb253604cbd88 EFLAGS: 00000247
     RAX: 000000000001960d RBX: 0000000000000002 RCX: ffff8cad2a064800
     RDX: 000000008020001b RSI: 0000000000000001 RDI: ffff8cad5d39f66c
     RBP: ffff8cad5d39f600 R08: 0000000000000001 R09: 0000000000000000
     R10: ffff8caa443e0c00 R11: ffffb253604cbcd8 R12: ffff8cacb8682538
     R13: 0000000000000005 R14: ffffb253604cbd90 R15: ffff8cad5d39f66c
      cm_process_send_error+0x122/0x1d0 [ib_cm]
      timeout_sends+0x1dd/0x270 [ib_core]
      process_one_work+0x1e2/0x3b0
      ? __pfx_worker_thread+0x10/0x10
      worker_thread+0x50/0x3a0
      ? __pfx_worker_thread+0x10/0x10
      kthread+0xdd/0x100
      ? __pfx_kthread+0x10/0x10
      ret_from_fork+0x29/0x50
      </TASK>
    
    Simplify the timeout handler by creating a local list of timed out WRs
    and invoking the send handler after the list has been created. The new
    method acquires/releases the lock once to fetch the list and hence helps
    to reduce lock contention when processing a large number of WRs.
    
    Signed-off-by: Saravanan Vajravel <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
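
The "collect under one lock hold, process afterwards" approach described above can be sketched with a singly linked list. The types and names below are hypothetical, and the lock is replaced by a counter so the sketch runs single-threaded; it illustrates the list handling only and is not the ib_core code.

```c
#include <stddef.h>

struct wr {
	int timed_out;
	struct wr *next;
};

struct agent {
	struct wr *wait_list;		/* protected by the agent lock */
	unsigned int lock_count;	/* counts lock acquisitions, for illustration */
};

/* Stand-ins for spin_lock()/spin_unlock() so the sketch runs single-threaded. */
static void agent_lock(struct agent *a)
{
	a->lock_count++;
}

static void agent_unlock(struct agent *a)
{
	(void)a;
}

/* Move every timed-out WR to a local list under a single lock hold and
 * return that list. The caller can then invoke the send handler for each
 * entry without holding the lock. */
static struct wr *collect_timed_out(struct agent *a)
{
	struct wr *local = NULL;
	struct wr **tail = &local;
	struct wr **pp;

	agent_lock(a);
	pp = &a->wait_list;
	while (*pp) {
		struct wr *w = *pp;

		if (w->timed_out) {
			*pp = w->next;	/* unlink from the shared list */
			w->next = NULL;
			*tail = w;	/* append to the local list */
			tail = &w->next;
		} else {
			pp = &w->next;
		}
	}
	agent_unlock(a);
	return local;
}
```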

RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults [+ + +]

Author: Michael Guralnik <[email protected]>
Date:   Mon Sep 9 13:05:00 2024 +0300

    RDMA/mlx5: Enforce umem boundaries for explicit ODP page faults
    
    [ Upstream commit 8c6d097d830f779fc1725fbaa1314f20a7a07b4b ]
    
    Page faults in the new memory scheme ask the driver to fetch additional
    pages in addition to the faulted memory access.
    This is done in order to prefetch pages before and after the area that
    got the page fault, assuming this will reduce the total amount of page
    faults.
    
    The driver should ensure it handles only the pages that are within the
    umem range.
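    
    Restricting a prefetch-extended fault range to the umem amounts to an
    interval clamp. The helper below is a hypothetical sketch, not the mlx5
    driver code; clamp_to_umem(), umem_start and umem_len are names assumed
    for illustration.

```c
#include <stdint.h>

/*
 * Clamp the requested range [*start, *start + *len) to the registered
 * range [umem_start, umem_start + umem_len). Returns 0 on success, or -1
 * if the two ranges do not overlap. Assumes no address wrap-around.
 */
static int clamp_to_umem(uint64_t *start, uint64_t *len,
                         uint64_t umem_start, uint64_t umem_len)
{
    uint64_t end = *start + *len;
    uint64_t umem_end = umem_start + umem_len;

    if (end <= umem_start || *start >= umem_end)
        return -1;
    if (*start < umem_start)
        *start = umem_start;
    if (end > umem_end)
        end = umem_end;
    *len = end - *start;
    return 0;
}
```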
    
    Signed-off-by: Michael Guralnik <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

RDMA/rtrs-srv: Avoid null pointer deref during path establishment [+ + +]

Author: Md Haris Iqbal <[email protected]>
Date:   Wed Aug 21 13:22:14 2024 +0200

    RDMA/rtrs-srv: Avoid null pointer deref during path establishment
    
    [ Upstream commit d0e62bf7b575fbfe591f6f570e7595dd60a2f5eb ]
    
    For RTRS path establishment, RTRS client initiates and completes con_num
    of connections. After establishing all its connections, the information
    is exchanged between the client and server through the info_req message.
    During this exchange, it is essential that all connections have been
    established, and the state of the RTRS srv path is CONNECTED.
    
    Add sanity checks for both conditions so that error scenarios are
    detected and the process is aborted before a null pointer dereference
    can occur.
    
    Signed-off-by: Md Haris Iqbal <[email protected]>
    Signed-off-by: Jack Wang <[email protected]>
    Signed-off-by: Grzegorz Prajsner <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Leon Romanovsky <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table [+ + +]

Author: Peng Fan <[email protected]>
Date:   Fri Jul 19 16:36:12 2024 +0800

    remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table
    
    [ Upstream commit e954a1bd16102abc800629f9900715d8ec4c3130 ]
    
    If there is a resource table device tree node, use its address as the
    resource table address; otherwise use the address inside the Cortex-M
    ELF file where the .resource_table section is loaded.
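    
    The selection rule above can be sketched as follows. struct fake_rproc
    and find_loaded_rsc_table() are hypothetical names for illustration and
    do not reproduce the imx_rproc driver.

```c
#include <stddef.h>

/*
 * Hypothetical driver state: dt_rsc_table is NULL when the device tree
 * has no resource table node.
 */
struct fake_rproc {
    void *dt_rsc_table;    /* address taken from the device tree, if any */
    void *elf_rsc_table;   /* where .resource_table from the ELF was loaded */
};

/* Prefer the device tree address; fall back to the ELF address. */
static void *find_loaded_rsc_table(const struct fake_rproc *rp)
{
    if (rp->dt_rsc_table)
        return rp->dt_rsc_table;
    return rp->elf_rsc_table;
}
```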
    
    An update to the NXP SDK enables Resource Domain Control (RDC) to
    protect the TCM, so Linux is unable to write to the TCM space when
    updating the resource table status, which causes a kernel dump. Using
    the address from the device tree avoids the kernel dump.
    
    Note: the NXP M4 SDK does not check for resource table updates, so it
    does not matter whether the resource table address comes from the ELF
    file or from the device tree. However, if people specify a resource
    table address in the device tree, they are aware of it and intend to
    use it rather than the address specified in the ELF file.
    
    Reviewed-by: Iuliana Prodan <[email protected]>
    Signed-off-by: Peng Fan <[email protected]>
    Reviewed-by: Daniel Baluta <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Mathieu Poirier <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled" [+ + +]

Author: Jakub Kicinski <[email protected]>
Date:   Fri Oct 4 07:21:15 2024 -0700

    Revert "net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled"
    
    [ Upstream commit 5546da79e6cc5bb3324bf25688ed05498fd3f86d ]
    
    This reverts commit b514c47ebf41a6536551ed28a05758036e6eca7c.
    
    The commit describes that we don't have to sync the page when
    recycling, and it tries to optimize that case. But we do need
    to sync after allocation. Recycling side should be changed to
    pass the right sync size instead.
    
    Fixes: b514c47ebf41 ("net: stmmac: set PP_FLAG_DMA_SYNC_DEV only if XDP is enabled")
    Reported-by: Jon Hunter <[email protected]>
    Link: https://lore.kernel.org/[email protected]
    Reviewed-by: Jacob Keller <[email protected]>
    Reviewed-by: Furong Xu <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Revert "PCI/MSI: Provide stubs for IMS functions" [+ + +]

Author: Bjorn Helgaas <[email protected]>
Date:   Wed Apr 10 17:13:01 2024 -0500

    Revert "PCI/MSI: Provide stubs for IMS functions"
    
    [ Upstream commit 372c669271bff736c5bc275c982d8d1b4f1f147c ]
    
    This reverts commit 41efa431244f6498833ff8ee8dde28c4924c5479.
    
    IMS (Interrupt Message Store) support appeared in v6.2, but there are no
    users yet.
    
    Remove it for now.  We can add it back when a user comes along.  If this is
    re-added later, this could be squashed with these commits:
    
      0194425af0c8 ("PCI/MSI: Provide IMS (Interrupt Message Store) support")
      c9e5bea27383 ("PCI/MSI: Provide pci_ims_alloc/free_irq()")
    
    which added the non-stub implementations.
    
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Bjorn Helgaas <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Reviewed-by: Thomas Gleixner <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

Revert "powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2" [+ + +]

Author: Geoff Levand <[email protected]>
Date:   Fri Jan 19 10:27:53 2024 +0000

    Revert "powerpc/ps3_defconfig: Disable PPC64_BIG_ENDIAN_ELF_ABI_V2"
    
    [ Upstream commit 914d081ead115f7ba685ab57f977716bdd09c894 ]
    
    This reverts commit 482b718a84f08b6fc84879c3e90cc57dba11c115.
    
    The preceding commits by Nicholas Piggin enable PS3 support for ELFv2,
    so there's no need to disable it for PS3 anymore.
    
    Signed-off-by: Geoff Levand <[email protected]>
    Signed-off-by: Michael Ellerman <[email protected]>
    Link: https://msgid.link/983836405df1b6001a2262972fb32d1aee97d6f5.1705654669.git.geoff@infradead.org
    Signed-off-by: Sasha Levin <[email protected]>

Revert "usb: yurex: Replace snprintf() with the safer scnprintf() variant" [+ + +]

Author: Oliver Neukum <[email protected]>
Date:   Mon Oct 7 11:39:47 2024 +0200

    Revert "usb: yurex: Replace snprintf() with the safer scnprintf() variant"
    
    commit 71c717cd8a2e180126932cc6851ff21c1d04d69a upstream.
    
    This reverts commit 86b20af11e84c26ae3fde4dcc4f490948e3f8035.
    
    This patch leads to passing 0 to simple_read_from_buffer()
    as a fifth argument, turning the read method into a nop.
    The change is fundamentally flawed, as it breaks the driver.
    
    Signed-off-by: Oliver Neukum <[email protected]>
    Cc: stable <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t [+ + +]

Author: Palmer Dabbelt <[email protected]>
Date:   Wed Jul 31 09:22:00 2024 -0700

    RISC-V: Don't have MAX_PHYSMEM_BITS exceed phys_addr_t
    
    [ Upstream commit ad380f6a0a5e82e794b45bb2eaec24ed51a56846 ]
    
    I recently ended up with a warning on some compilers along the lines of
    
          CC      kernel/resource.o
        In file included from include/linux/ioport.h:16,
                         from kernel/resource.c:15:
        kernel/resource.c: In function 'gfr_start':
        include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
           49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
              |                                     ^
        include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
           52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
              |         ^~~~~~~~~~~~~~~~~
        include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
          161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
              |                           ^~~~~~~~~~
        kernel/resource.c:1829:23: note: in expansion of macro 'min_t'
         1829 |                 end = min_t(resource_size_t, base->end,
              |                       ^~~~~
        kernel/resource.c: In function 'gfr_continue':
        include/linux/minmax.h:49:37: error: conversion from 'long long unsigned int' to 'resource_size_t' {aka 'unsigned int'} changes value from '17179869183' to '4294967295' [-Werror=overflow]
           49 |         ({ type ux = (x); type uy = (y); __cmp(op, ux, uy); })
              |                                     ^
        include/linux/minmax.h:52:9: note: in expansion of macro '__cmp_once_unique'
           52 |         __cmp_once_unique(op, type, x, y, __UNIQUE_ID(x_), __UNIQUE_ID(y_))
              |         ^~~~~~~~~~~~~~~~~
        include/linux/minmax.h:161:27: note: in expansion of macro '__cmp_once'
          161 | #define min_t(type, x, y) __cmp_once(min, type, x, y)
              |                           ^~~~~~~~~~
        kernel/resource.c:1847:24: note: in expansion of macro 'min_t'
         1847 |                addr <= min_t(resource_size_t, base->end,
              |                        ^~~~~
        cc1: all warnings being treated as errors
    
    which looks like a real problem: our phys_addr_t is only 32 bits now, so
    having 34-bit masks is just going to result in overflows.
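    
    The truncation reported by the compiler can be reproduced in plain C.
    This is a sketch with hypothetical helper names (physmem_mask(),
    to_phys32()), modelling a 32-bit phys_addr_t with uint32_t.

```c
#include <stdint.h>

/* Mask covering the low `bits` physical address bits (bits < 64). */
static uint64_t physmem_mask(unsigned int bits)
{
    return (UINT64_C(1) << bits) - 1;
}

/*
 * Model of storing a value in a 32-bit phys_addr_t: the upper bits
 * are discarded.
 */
static uint32_t to_phys32(uint64_t v)
{
    return (uint32_t)v;
}
```

    With these helpers, physmem_mask(34) is 17179869183 and
    to_phys32(physmem_mask(34)) is 4294967295, the two values shown in the
    error message above.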
    
    Reviewed-by: Charlie Jenkins <[email protected]>
    Reviewed-by: Alexandre Ghiti <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown [+ + +]

Author: Ying Sun <[email protected]>
Date:   Thu Jul 11 08:32:36 2024 +0000

    riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown
    
    [ Upstream commit c6ebf2c528470a09be77d0d9df2c6617ea037ac5 ]
    
    Running the following on a kernel with CONFIG_RISCV_ALTERNATIVE enabled:
      kexec -sl vmlinux
    
    Error:
      kexec_image: Unknown rela relocation: 34
      kexec_image: Error loading purgatory ret=-8
    and
      kexec_image: Unknown rela relocation: 38
      kexec_image: Error loading purgatory ret=-8
    
    The purgatory code uses the 16-bit addition and subtraction relocation
    types, but they are not handled, resulting in kexec_file_load failure.
    So add handling for them to arch_kexec_apply_relocations_add().
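    
    A minimal user-space sketch of applying these two relocation types
    follows. The numbers 34 and 38 are the R_RISCV_ADD16 and R_RISCV_SUB16
    values from the RISC-V ELF psABI and match the error messages above;
    apply_rela16() is a hypothetical helper, not the kernel function.

```c
#include <stdint.h>
#include <string.h>

/* Relocation numbers from the RISC-V ELF psABI. */
#define R_RISCV_ADD16 34
#define R_RISCV_SUB16 38

/*
 * Apply a 16-bit add/sub relocation at `loc` using the resolved value
 * `val`. Returns 0 on success, -1 for relocation types this sketch does
 * not know. memcpy() avoids unaligned accesses.
 */
static int apply_rela16(void *loc, unsigned int type, uint64_t val)
{
    uint16_t cur;

    memcpy(&cur, loc, sizeof(cur));
    switch (type) {
    case R_RISCV_ADD16:
        cur = (uint16_t)(cur + (uint16_t)val);
        break;
    case R_RISCV_SUB16:
        cur = (uint16_t)(cur - (uint16_t)val);
        break;
    default:
        return -1;
    }
    memcpy(loc, &cur, sizeof(cur));
    return 0;
}
```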
    
    Tested on RISC-V64 Qemu-virt, issue fixed.
    
    Co-developed-by: Petr Tesarik <[email protected]>
    Signed-off-by: Petr Tesarik <[email protected]>
    Signed-off-by: Ying Sun <[email protected]>
    Reviewed-by: Andrew Jones <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

riscv: avoid Imbalance in RAS [+ + +]

Author: Jisheng Zhang <[email protected]>
Date:   Sun Jul 21 01:06:59 2024 +0800

    riscv: avoid Imbalance in RAS
    
    [ Upstream commit 8f1534e7440382d118c3d655d3a6014128b2086d ]
    
    Inspired by [1], remove the code that modifies ra to avoid an imbalanced
    RAS (return address stack), which may lead to incorrect predictions on
    return.
    
    Link: https://lore.kernel.org/linux-riscv/[email protected]/ [1]
    Signed-off-by: Jisheng Zhang <[email protected]>
    Reviewed-by: Cyril Bur <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

riscv: cpufeature: Fix thead vector hwcap removal [+ + +]

Author: Charlie Jenkins <[email protected]>
Date:   Thu May 2 21:50:50 2024 -0700

    riscv: cpufeature: Fix thead vector hwcap removal
    
    [ Upstream commit e482eab4d1eb31031eff2b6afb71776483101979 ]
    
    The riscv_cpuinfo struct that contains mvendorid and marchid is not
    populated until all harts are booted which happens after the DT parsing.
    Use the mvendorid/marchid from the boot hart to determine if the DT
    contains an invalid V.
    
    Fixes: d82f32202e0d ("RISC-V: Ignore V from the riscv,isa DT property on older T-Head CPUs")
    Signed-off-by: Charlie Jenkins <[email protected]>
    Reviewed-by: Conor Dooley <[email protected]>
    Reviewed-by: Guo Ren <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macro [+ + +]

Author: Song Shuai <[email protected]>
Date:   Mon Dec 11 19:03:31 2023 +0800

    riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macro
    
    [ Upstream commit a7565f4d068b2e60f95c3223c3167c40b8fe83ae ]
    
    The commit be97d0db5f44 ("riscv: VMAP_STACK overflow
    detection thread-safe") got rid of `shadow_stack`,
    so SHADOW_OVERFLOW_STACK_SIZE should be removed too.
    
    Fixes: be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe")
    Signed-off-by: Song Shuai <[email protected]>
    Reviewed-by: Sami Tolvanen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Palmer Dabbelt <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

rtnetlink: Add bulk registration helpers for rtnetlink message handlers. [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Oct 8 11:47:32 2024 -0700

    rtnetlink: Add bulk registration helpers for rtnetlink message handlers.
    
    [ Upstream commit 07cc7b0b942bf55ef1a471470ecda8d2a6a6541f ]
    
    Before commit addf9b90de22 ("net: rtnetlink: use rcu to free rtnl message
    handlers"), once rtnl_msg_handlers[protocol] was allocated, the following
    rtnl_register_module() for the same protocol never failed.
    
    However, after the commit, rtnl_msg_handlers[protocol][msgtype] needs to
    be allocated in each rtnl_register_module(), so each call could fail.
    
    Many callers of rtnl_register_module() do not handle the returned error,
    so error handling needs to be added in many places.
    
    To handle that easily, let's add wrapper functions for bulk registration
    of rtnetlink message handlers.
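    
    The idea of bulk registration with rollback can be sketched in
    user-space C as below. Everything here is a simplified, hypothetical
    model (struct msg_handler, handler_table, register_many() and friends),
    and the failure condition is a taken or out-of-range slot rather than
    the allocation failure discussed above.

```c
#include <stddef.h>

#define MAX_MSGTYPE 8

/* Hypothetical, simplified handler descriptor. */
struct msg_handler {
    int msgtype;
    int (*doit)(void);
};

static int (*handler_table[MAX_MSGTYPE])(void);

/* Register one handler; fails for out-of-range or already-taken slots. */
static int register_one(const struct msg_handler *h)
{
    if (h->msgtype < 0 || h->msgtype >= MAX_MSGTYPE)
        return -1;
    if (handler_table[h->msgtype])
        return -1;
    handler_table[h->msgtype] = h->doit;
    return 0;
}

/* Remove the first n handlers of the array from the table. */
static void unregister_many(const struct msg_handler *h, size_t n)
{
    for (size_t i = 0; i < n; i++)
        handler_table[h[i].msgtype] = NULL;
}

/* Register all n handlers or none: on failure, undo those already added. */
static int register_many(const struct msg_handler *h, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int err = register_one(&h[i]);

        if (err) {
            unregister_many(h, i);
            return err;
        }
    }
    return 0;
}

/* Trivial handler used for demonstration. */
static int dummy_doit(void)
{
    return 0;
}
```

    A caller then checks a single return value instead of one per handler.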
    
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Stable-dep-of: 78b7b991838a ("vxlan: Handle error of rtnl_register_module().")
    Signed-off-by: Sasha Levin <[email protected]>

rtnetlink: add RTNL_FLAG_DUMP_UNLOCKED flag [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Thu Feb 22 10:50:15 2024 +0000

    rtnetlink: add RTNL_FLAG_DUMP_UNLOCKED flag
    
    [ Upstream commit 386520e0ecc01004d3a29c70c5a77d4bbf8a8420 ]
    
    Similarly to RTNL_FLAG_DOIT_UNLOCKED, this new flag
    allows dump operations registered via rtnl_register()
    or rtnl_register_module() to opt-out from RTNL protection.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Reviewed-by: Donald Hunter <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Stable-dep-of: 5be2062e3080 ("mpls: Handle error of rtnl_register_module().")
    Signed-off-by: Sasha Levin <[email protected]>

rtnetlink: change nlk->cb_mutex role [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Thu Feb 22 10:50:14 2024 +0000

    rtnetlink: change nlk->cb_mutex role
    
    [ Upstream commit e39951d965bf58b5aba7f61dc1140dcb8271af22 ]
    
    In commit af65bdfce98d ("[NETLINK]: Switch cb_lock spinlock
    to mutex and allow to override it"), Patrick McHardy used
    a common mutex to protect both nlk->cb and the dump() operations.
    
    The override is used for rtnl dumps, registered with
    rtnl_register() and rtnl_register_module().
    
    We want to be able to opt-out some dump() operations
    to not acquire RTNL, so we need to protect nlk->cb
    with a per socket mutex.
    
    This patch renames nlk->cb_def_mutex to nlk->nl_cb_mutex.
    
    The optional pointer to the mutex used to protect dump()
    calls is stored in nlk->dump_cb_mutex.
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Stable-dep-of: 5be2062e3080 ("mpls: Handle error of rtnl_register_module().")
    Signed-off-by: Sasha Levin <[email protected]>

rxrpc: Fix uninitialised variable in rxrpc_send_data() [+ + +]

Author: David Howells <[email protected]>
Date:   Tue Oct 1 14:26:59 2024 +0100

    rxrpc: Fix uninitialised variable in rxrpc_send_data()
    
    [ Upstream commit 7a310f8d7dfe2d92a1f31ddb5357bfdd97eed273 ]
    
    Fix the uninitialised txb variable in rxrpc_send_data() by moving the code
    that loads it above all the jumps to maybe_error, txb being stored back
    into call->tx_pending right before the normal return.
    
    Fixes: b0f571ecd794 ("rxrpc: Fix locking in rxrpc's sendmsg")
    Reported-by: Dan Carpenter <[email protected]>
    Closes: https://lists.infradead.org/pipermail/linux-afs/2024-October/008896.html
    Signed-off-by: David Howells <[email protected]>
    cc: Marc Dionne <[email protected]>
    cc: [email protected]
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

s390/cpum_sf: Remove WARN_ON_ONCE statements [+ + +]

Author: Thomas Richter <[email protected]>
Date:   Wed Jul 10 12:23:47 2024 +0200

    s390/cpum_sf: Remove WARN_ON_ONCE statements
    
    [ Upstream commit b495e710157606889f2d8bdc62aebf2aa02f67a7 ]
    
    Remove WARN_ON_ONCE statements. These have not triggered in the
    past.
    
    Signed-off-by: Thomas Richter <[email protected]>
    Acked-by: Sumanth Korikkar <[email protected]>
    Cc: Heiko Carstens <[email protected]>
    Cc: Vasily Gorbik <[email protected]>
    Cc: Alexander Gordeev <[email protected]>
    Signed-off-by: Vasily Gorbik <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

s390/facility: Disable compile time optimization for decompressor code [+ + +]

Author: Heiko Carstens <[email protected]>
Date:   Wed Sep 4 11:39:24 2024 +0200

    s390/facility: Disable compile time optimization for decompressor code
    
    [ Upstream commit 0147addc4fb72a39448b8873d8acdf3a0f29aa65 ]
    
    Disable compile time optimizations of test_facility() for the
    decompressor. The decompressor should not contain any optimized code
    depending on the architecture level set the kernel image is compiled
    for to avoid unexpected operation exceptions.
    
    Add a __DECOMPRESSOR check to test_facility() to enforce that
    facilities are always checked during runtime for the decompressor.
    
    Reviewed-by: Sven Schnelle <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

s390/mm: Add cond_resched() to cmm_alloc/free_pages() [+ + +]

Author: Gerald Schaefer <[email protected]>
Date:   Mon Sep 2 14:02:19 2024 +0200

    s390/mm: Add cond_resched() to cmm_alloc/free_pages()
    
    [ Upstream commit 131b8db78558120f58c5dc745ea9655f6b854162 ]
    
    Adding/removing a large number of pages at once to/from the CMM balloon
    can result in rcu_sched stalls or workqueue lockups, because of busy
    looping without cond_resched().
    
    Prevent this by adding a cond_resched(). cmm_free_pages() holds a
    spin_lock while looping, so it cannot be added directly to the existing
    loop. Instead, introduce a wrapper function that operates on at most 256
    pages at once, and add it there.
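    
    The wrapper pattern can be sketched in user-space C: do a bounded amount
    of work per lock hold and yield between chunks. This is not the s390
    code; free_batch(), free_pages_chunked() and balloon_pages are
    hypothetical, a pthread mutex stands in for the spin lock, and
    sched_yield() stands in for cond_resched().

```c
#include <pthread.h>
#include <sched.h>

#define BATCH 256   /* maximum pages freed per lock hold */

static pthread_mutex_t cmm_lock = PTHREAD_MUTEX_INITIALIZER;
static long balloon_pages;   /* hypothetical counter standing in for the page list */

/* Free up to nr pages while holding the lock; returns pages not freed. */
static long free_batch(long nr)
{
    pthread_mutex_lock(&cmm_lock);
    while (nr > 0 && balloon_pages > 0) {
        balloon_pages--;
        nr--;
    }
    pthread_mutex_unlock(&cmm_lock);
    return nr;
}

/*
 * Wrapper: work in chunks of at most BATCH pages and yield between
 * chunks, outside the lock. Returns the number of pages not freed.
 */
static long free_pages_chunked(long nr, long *yields)
{
    while (nr > 0) {
        long chunk = nr < BATCH ? nr : BATCH;
        long left = free_batch(chunk);

        nr -= chunk - left;
        if (left)            /* balloon is empty, nothing more to free */
            break;
        sched_yield();       /* user-space analogue of cond_resched() */
        (*yields)++;
    }
    return nr;
}
```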
    
    Signed-off-by: Gerald Schaefer <[email protected]>
    Reviewed-by: Heiko Carstens <[email protected]>
    Signed-off-by: Heiko Carstens <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd() [+ + +]

Author: Justin Tee <[email protected]>
Date:   Thu Sep 12 16:24:40 2024 -0700

    scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
    
    [ Upstream commit 93bcc5f3984bf4f51da1529700aec351872dbfff ]
    
    During HBA stress testing, a spam of received PLOGIs exposes a resource
    recovery bug causing leakage of lpfc_sglq entries from the global
    phba->sli4_hba.lpfc_els_sgl_list.
    
    The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover
    outstanding ELS sgls when walking the txcmplq.  Only CMD_ELS_REQUEST64_CRs
    and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists.  A check
    for CMD_XMIT_ELS_RSP64_WQE is missing in order to recover LS_ACC usages of
    the phba->sli4_hba.lpfc_els_sgl_list too.
    
    Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when
    adding WQEs to the abort and cancel list in lpfc_els_flush_cmd().  Also,
    update naming convention from CRs to WQEs.
    
    Signed-off-by: Justin Tee <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance [+ + +]

Author: Justin Tee <[email protected]>
Date:   Thu Sep 12 16:24:44 2024 -0700

    scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
    
    [ Upstream commit 0a3c84f71680684c1d41abb92db05f95c09111e8 ]
    
    Deleting an NPIV instance requires all fabric ndlps to be released before
    an NPIV's resources can be torn down.  Failure to release fabric ndlps
    beforehand opens kref imbalance race conditions.  Fix by forcing the DA_ID
    to complete synchronously with usage of wait_queue.
    
    Signed-off-by: Justin Tee <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: Remove scsi device no_start_on_resume flag [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Sat Aug 26 12:48:33 2023 +0900

    scsi: Remove scsi device no_start_on_resume flag
    
    [ Upstream commit c4367ac83805a2322268c9736cd8ef9124063424 ]
    
    The scsi device flag no_start_on_resume is not set by any scsi low
    level driver. Remove it. This reverts the changes introduced by commit
    0a8589055936 ("ata,scsi: do not issue START STOP UNIT on resume").
    
    Signed-off-by: Damien Le Moal <[email protected]>
    Reviewed-by: Hannes Reinecke <[email protected]>
    Tested-by: Chia-Lin Kao (AceLan) <[email protected]>
    Tested-by: Geert Uytterhoeven <[email protected]>
    Reviewed-by: Martin K. Petersen <[email protected]>
    Stable-dep-of: 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk message")
    Signed-off-by: Sasha Levin <[email protected]>

scsi: Revert "scsi: sd: Do not repeat the starting disk message" [+ + +]

Author: Johan Hovold <[email protected]>
Date:   Tue Jul 16 18:11:01 2024 +0200

    scsi: Revert "scsi: sd: Do not repeat the starting disk message"
    
    commit da3e19ef0b3de0aa4b25595bdc214c02a04f19b8 upstream.
    
    This reverts commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4.
    
    The offending commit tried to suppress a double "Starting disk" message for
    some drivers, but instead started spamming the log with bogus messages
    every five seconds:
    
            [  311.798956] sd 0:0:0:0: [sda] Starting disk
            [  316.919103] sd 0:0:0:0: [sda] Starting disk
            [  322.040775] sd 0:0:0:0: [sda] Starting disk
            [  327.161140] sd 0:0:0:0: [sda] Starting disk
            [  332.281352] sd 0:0:0:0: [sda] Starting disk
            [  337.401878] sd 0:0:0:0: [sda] Starting disk
            [  342.521527] sd 0:0:0:0: [sda] Starting disk
            [  345.850401] sd 0:0:0:0: [sda] Starting disk
            [  350.967132] sd 0:0:0:0: [sda] Starting disk
            [  356.090454] sd 0:0:0:0: [sda] Starting disk
            ...
    
    on machines that do not actually stop the disk on runtime suspend (e.g.
    the Qualcomm sc8280xp CRD with UFS).
    
    Let's just revert for now to address the regression.
    
    Fixes: 7a6bbc2829d4 ("scsi: sd: Do not repeat the starting disk message")
    Cc: [email protected]
    Signed-off-by: Johan Hovold <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Bart Van Assche <[email protected]>
    Reviewed-by: Damien Le Moal <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: sd: Do not repeat the starting disk message [+ + +]

Author: Damien Le Moal <[email protected]>
Date:   Tue Jul 2 06:53:26 2024 +0900

    scsi: sd: Do not repeat the starting disk message
    
    [ Upstream commit 7a6bbc2829d4ab592c7e440a6f6f5deb3cd95db4 ]
    
    The SCSI disk message "Starting disk" to signal resuming of a suspended
    disk is printed in both sd_resume() and sd_resume_common() which results
    in this message being printed twice when resuming from e.g. autosuspend:
    
    $ echo 5000 > /sys/block/sda/device/power/autosuspend_delay_ms
    $ echo auto > /sys/block/sda/device/power/control
    
    [ 4962.438293] sd 0:0:0:0: [sda] Synchronizing SCSI cache
    [ 4962.501121] sd 0:0:0:0: [sda] Stopping disk
    
    $ echo on > /sys/block/sda/device/power/control
    
    [ 4972.805851] sd 0:0:0:0: [sda] Starting disk
    [ 4980.558806] sd 0:0:0:0: [sda] Starting disk
    
    Fix this double print by removing the call to sd_printk() from sd_resume()
    and moving the call to sd_printk() in sd_resume_common() earlier in the
    function, before the check using sd_do_start_stop().  Doing so, the message
    is printed once regardless of whether sd_resume_common() actually executes
    sd_start_stop_device() (i.e. SCSI device case) or not (libsas and libata
    managed ATA devices case).
    
    Fixes: 0c76106cb975 ("scsi: sd: Fix TCG OPAL unlock on system resume")
    Cc: [email protected]
    Signed-off-by: Damien Le Moal <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reviewed-by: Bart Van Assche <[email protected]>
    Reviewed-by: John Garry <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

scsi: ufs: Use pre-calculated offsets in ufshcd_init_lrb() [+ + +]

Author: Avri Altman <[email protected]>
Date:   Tue Sep 10 07:45:43 2024 +0300

    scsi: ufs: Use pre-calculated offsets in ufshcd_init_lrb()
    
    commit d5130c5a093257aa4542aaded8034ef116a7624a upstream.
    
    Replace manual offset calculations for response_upiu and prd_table in
    ufshcd_init_lrb() with pre-calculated offsets already stored in the
    utp_transfer_req_desc structure. The pre-calculated offsets are set
    differently in ufshcd_host_memory_configure() based on the
    UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk, ensuring correct alignment and
    access.
    
    Fixes: 26f968d7de82 ("scsi: ufs: Introduce UFSHCD_QUIRK_PRDT_BYTE_GRAN quirk")
    Cc: [email protected]
    Signed-off-by: Avri Altman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Acked-by: Bart Van Assche <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

scsi: wd33c93: Don't use stale scsi_pointer value [+ + +]

Author: Daniel Palmer <[email protected]>
Date:   Thu Oct 3 13:29:47 2024 +1000

    scsi: wd33c93: Don't use stale scsi_pointer value
    
    commit 9023ed8d91eb1fcc93e64dc4962f7412b1c4cbec upstream.
    
    A regression was introduced with commit dbb2da557a6a ("scsi: wd33c93:
    Move the SCSI pointer to private command data") which results in an oops
    in wd33c93_intr(). That commit added the scsi_pointer variable and
    initialized it from hostdata->connected. However, during selection,
    hostdata->connected is not yet valid. Fix this by getting the current
    scsi_pointer from hostdata->selecting.
    
    Cc: Daniel Palmer <[email protected]>
    Cc: Michael Schmitz <[email protected]>
    Cc: [email protected]
    Fixes: dbb2da557a6a ("scsi: wd33c93: Move the SCSI pointer to private command data")
    Signed-off-by: Daniel Palmer <[email protected]>
    Co-developed-by: Finn Thain <[email protected]>
    Signed-off-by: Finn Thain <[email protected]>
    Link: https://lore.kernel.org/r/09e11a0a54e6aa2a88bd214526d305aaf018f523.1727926187.git.fthain@linux-m68k.org
    Reviewed-by: Michael Schmitz <[email protected]>
    Reviewed-by: Bart Van Assche <[email protected]>
    Signed-off-by: Martin K. Petersen <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

sctp: ensure sk_state is set to CLOSED if hashing fails in sctp_listen_start [+ + +]

Author: Xin Long <[email protected]>
Date:   Mon Oct 7 12:25:11 2024 -0400

    sctp: ensure sk_state is set to CLOSED if hashing fails in sctp_listen_start
    
    [ Upstream commit 4d5c70e6155d5eae198bade4afeab3c1b15073b6 ]
    
    If hashing fails in sctp_listen_start(), the socket remains in the
    LISTENING state, even though it was not added to the hash table.
    This can lead to a scenario where a socket appears to be listening
    without actually being accessible.
    
    This patch ensures that if the hashing operation fails, the sk_state
    is set back to CLOSED before returning an error.
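    
    The error path can be sketched as follows. This is a hypothetical model
    rather than the SCTP code: struct fake_sock, hash_endpoint() and
    listen_start() are names assumed for illustration, and the `fail`
    argument simulates a hashing failure.

```c
enum sk_state { SK_CLOSED, SK_LISTENING };

struct fake_sock {
    enum sk_state state;
    int hashed;   /* 1 once the socket is in the hash table */
};

/* Hypothetical hashing step; fails when fail != 0. */
static int hash_endpoint(struct fake_sock *sk, int fail)
{
    if (fail)
        return -1;
    sk->hashed = 1;
    return 0;
}

/* Enter the listening state, rolling the state back if hashing fails. */
static int listen_start(struct fake_sock *sk, int fail)
{
    int err;

    sk->state = SK_LISTENING;
    err = hash_endpoint(sk, fail);
    if (err) {
        sk->state = SK_CLOSED;   /* do not leave an unreachable listener */
        return err;
    }
    return 0;
}
```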
    
    Note that there is no need to undo the autobind operation if hashing
    fails, as the bind port can still be used for the next listen() call on
    the same socket.
    
    Fixes: 76c6d988aeb3 ("sctp: add sock_reuseport for the sock in __sctp_hash_endpoint")
    Reported-by: Marcelo Ricardo Leitner <[email protected]>
    Signed-off-by: Xin Long <[email protected]>
    Acked-by: Marcelo Ricardo Leitner <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

secretmem: disable memfd_secret() if arch cannot set direct map [+ + +]

Author: Patrick Roy <[email protected]>
Date:   Tue Oct 1 09:00:41 2024 +0100

    secretmem: disable memfd_secret() if arch cannot set direct map
    
    commit 532b53cebe58f34ce1c0f34d866f5c0e335c53c6 upstream.
    
    Return -ENOSYS from memfd_secret() syscall if !can_set_direct_map().  This
    is the case for example on some arm64 configurations, where marking 4k
    PTEs in the direct map not present can only be done if the direct map is
    set up at 4k granularity in the first place (as ARM's break-before-make
    semantics do not easily allow breaking apart large/gigantic pages).
    
    More precisely, on arm64 systems with !can_set_direct_map(),
    set_direct_map_invalid_noflush() is a no-op, however it returns success
    (0) instead of an error.  This means that memfd_secret will seemingly
    "work" (e.g.  syscall succeeds, you can mmap the fd and fault in pages),
    but it does not actually achieve its goal of removing its memory from the
    direct map.
    
    Note that with this patch, memfd_secret() will start erroring on systems
    where can_set_direct_map() returns false (arm64 with
    CONFIG_RODATA_FULL_DEFAULT_ENABLED=n, CONFIG_DEBUG_PAGEALLOC=n and
    CONFIG_KFENCE=n), but that still seems better than the current silent
    failure.  Since CONFIG_RODATA_FULL_DEFAULT_ENABLED defaults to 'y', most
    arm64 systems actually have a working memfd_secret() and aren't
    affected.
    
    From going through the iterations of the original memfd_secret patch
    series, it seems that disabling the syscall in these scenarios was the
    intended behavior [1] (preferred over having
    set_direct_map_invalid_noflush return an error as that would result in
    SIGBUSes at page-fault time), however the check for it got dropped between
    v16 [2] and v17 [3], when secretmem moved away from CMA allocations.
    
    [1]: https://lore.kernel.org/lkml/[email protected]/
    [2]: https://lore.kernel.org/lkml/[email protected]/#t
    [3]: https://lore.kernel.org/lkml/[email protected]/
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1507f51255c9 ("mm: introduce memfd_secret system call to create "secret" memory areas")
    Signed-off-by: Patrick Roy <[email protected]>
    Reviewed-by: Mike Rapoport (Microsoft) <[email protected]>
    Cc: Alexander Graf <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: James Gowans <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
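
Illustration (not part of the upstream commit): the gate described in the secretmem entry above can be sketched in user space roughly as follows. The function name memfd_secret_model() and its boolean parameter are hypothetical stand-ins for the syscall entry point and for the kernel's can_set_direct_map(); the placeholder descriptor value is an assumption made only so the sketch has something to return.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model, not kernel code.
 * can_set_direct_map stands in for the kernel's can_set_direct_map().
 * Returns a placeholder descriptor on success, or -ENOSYS when the
 * architecture cannot remove pages from the direct map.
 */
static int memfd_secret_model(bool can_set_direct_map)
{
    /* Fail loudly rather than hand out memory that stays mapped. */
    if (!can_set_direct_map)
        return -ENOSYS;

    return 3; /* placeholder file descriptor, assumed for illustration */
}
```

A caller that sees -ENOSYS from this model knows the feature is unavailable, which is the behavior the entry prefers over a silent failure.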

selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test [+ + +]

Author: Daniel Borkmann <[email protected]>
Date:   Fri Sep 13 21:17:51 2024 +0200

    selftests/bpf: Fix ARG_PTR_TO_LONG {half-,}uninitialized test
    
    [ Upstream commit b8e188f023e07a733b47d5865311ade51878fe40 ]
    
    The assumption of 'in privileged mode reads from uninitialized stack locations
    are permitted' is not quite correct since the verifier was probing for read
    access rather than write access. Both tests need to be annotated as __success
    for privileged and unprivileged.
    
    Signed-off-by: Daniel Borkmann <[email protected]>
    Acked-by: Andrii Nakryiko <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test [+ + +]

Author: Donet Tom <[email protected]>
Date:   Fri Sep 27 00:07:52 2024 -0500

    selftests/mm: fix incorrect buffer->mirror size in hmm2 double_map test
    
    commit 76503e1fa1a53ef041a120825d5ce81c7fe7bdd7 upstream.
    
    The hmm2 double_map test was failing due to an incorrect buffer->mirror
    size.  The buffer->mirror size was 6, while buffer->ptr size was 6 *
    PAGE_SIZE.  The test failed because the kernel's copy_to_user function was
    attempting to copy a 6 * PAGE_SIZE buffer to buffer->mirror.  Since the
    size of buffer->mirror was incorrect, copy_to_user failed.
    
    This patch corrects the buffer->mirror size to 6 * PAGE_SIZE.
    
    Test Result without this patch
    ==============================
     #  RUN           hmm2.hmm2_device_private.double_map ...
     # hmm-tests.c:1680:double_map:Expected ret (-14) == 0 (0)
     # double_map: Test terminated by assertion
     #          FAIL  hmm2.hmm2_device_private.double_map
     not ok 53 hmm2.hmm2_device_private.double_map
    
    Test Result with this patch
    ===========================
     #  RUN           hmm2.hmm2_device_private.double_map ...
     #            OK  hmm2.hmm2_device_private.double_map
     ok 53 hmm2.hmm2_device_private.double_map
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: fee9f6d1b8df ("mm/hmm/test: add selftests for HMM")
    Signed-off-by: Donet Tom <[email protected]>
    Reviewed-by: Muhammad Usama Anjum <[email protected]>
    Cc: Jérôme Glisse <[email protected]>
    Cc: Kees Cook <[email protected]>
    Cc: Mark Brown <[email protected]>
    Cc: Przemek Kitszel <[email protected]>
    Cc: Ritesh Harjani (IBM) <[email protected]>
    Cc: Shuah Khan <[email protected]>
    Cc: Ralph Campbell <[email protected]>
    Cc: Jason Gunthorpe <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
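
Illustration (not part of the upstream commit): a minimal sketch of the sizing rule from the hmm2 entry above, assuming a hypothetical struct mirror_buffer and helper mirror_buffer_init(). It only shows that both regions are allocated with the same byte count (pages times page size); it does not reproduce the selftest code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer pair, loosely modeled on the entry's description. */
struct mirror_buffer {
    void *ptr;          /* source region */
    void *mirror;       /* destination that device contents are copied into */
    size_t size;        /* bytes available in ptr */
    size_t mirror_size; /* bytes available in mirror */
};

/* Allocate both regions with the same byte size: npages * page_size. */
static int mirror_buffer_init(struct mirror_buffer *b, size_t npages,
                              size_t page_size)
{
    size_t size = npages * page_size;

    b->ptr = malloc(size);
    /* The failing test sized the mirror as 6, not 6 * PAGE_SIZE. */
    b->mirror = malloc(size);
    if (!b->ptr || !b->mirror) {
        free(b->ptr);
        free(b->mirror);
        return -1;
    }
    b->size = size;
    b->mirror_size = size;
    return 0;
}
```

With matching sizes, a copy of size bytes into mirror stays within the allocation.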

selftests/rseq: Fix mm_cid test failure [+ + +]

Author: Mathieu Desnoyers <[email protected]>
Date:   Tue Oct 8 21:28:01 2024 -0400

    selftests/rseq: Fix mm_cid test failure
    
    commit a0cc649353bb726d4aa0db60dce467432197b746 upstream.
    
    Adapt the rseq.c/rseq.h code to follow GNU C library changes introduced by:
    
    glibc commit 2e456ccf0c34 ("Linux: Make __rseq_size useful for feature detection (bug 31965)")
    
    Without this fix, rseq selftests for mm_cid fail:
    
    ./run_param_test.sh
    Default parameters
    Running test spinlock
    Running compare-twice test spinlock
    Running mm_cid test spinlock
    Error: cpu id getter unavailable
    
    Fixes: 18c2355838e7 ("selftests/rseq: Implement rseq mm_cid field support")
    Signed-off-by: Mathieu Desnoyers <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    CC: Boqun Feng <[email protected]>
    CC: "Paul E. McKenney" <[email protected]>
    Cc: Shuah Khan <[email protected]>
    CC: Carlos O'Donell <[email protected]>
    CC: Florian Weimer <[email protected]>
    CC: [email protected]
    CC: [email protected]
    Signed-off-by: Shuah Khan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

selftests: Introduce Makefile variable to list shared bash scripts [+ + +]

Author: Benjamin Poirier <[email protected]>
Date:   Fri Jan 26 18:21:18 2024 -0500

    selftests: Introduce Makefile variable to list shared bash scripts
    
    [ Upstream commit 2a0683be5b4c9829e8335e494a21d1148e832822 ]
    
    Some tests written in bash source other files in a parent directory. For
    example, drivers/net/bonding/dev_addr_lists.sh sources
    net/forwarding/lib.sh. If a subset of tests is exported and run outside the
    source tree (for example by using `make -C tools/testing/selftests gen_tar
    TARGETS="drivers/net/bonding"`), these other files must be made available
    as well.
    
    Commit ae108c48b5d2 ("selftests: net: Fix cross-tree inclusion of scripts")
    addressed this problem by symlinking and copying the sourced files but this
    only works for direct dependencies. Commit 25ae948b4478 ("selftests/net:
    add lib.sh") changed net/forwarding/lib.sh to source net/lib.sh. As a
    result, that latter file must be included as well when the former is
    exported. This was not handled and was reverted in commit 2114e83381d3
    ("selftests: forwarding: Avoid failures to source net/lib.sh"). In order to
    allow reinstating the inclusion of net/lib.sh from net/forwarding/lib.sh,
    add a mechanism to list dependent files in a new Makefile variable and
    export them. This allows sourcing those files using the same expression
    whether tests are run in-tree or exported.
    
    Dependencies are not resolved recursively so transitive dependencies must
    be listed in TEST_INCLUDES. For example, if net/forwarding/lib.sh sources
    net/lib.sh; the Makefile related to a test that sources
    net/forwarding/lib.sh from a parent directory must list:
    TEST_INCLUDES := \
            ../../../net/forwarding/lib.sh \
            ../../../net/lib.sh
    
    v2:
    Fix rst syntax in Documentation/dev-tools/kselftest.rst (Jakub Kicinski)
    
    v1 (from RFC):
    * changed TEST_INCLUDES to take relative paths, like other TEST_* variables
      (Vladimir Oltean)
    * preserved common "$(MAKE) OUTPUT=... -C ... target" ordering in Makefile
      (Petr Machata)
    
    Signed-off-by: Benjamin Poirier <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: net: no_forwarding: fix VID for $swp2 in one_bridge_two_pvids() test [+ + +]

Author: Kacper Ludwinski <[email protected]>
Date:   Wed Oct 2 14:10:16 2024 +0900

    selftests: net: no_forwarding: fix VID for $swp2 in one_bridge_two_pvids() test
    
    [ Upstream commit 9f49d14ec41ce7be647028d7d34dea727af55272 ]
    
    Currently, the second bridge command overwrites the first one.
    Fix this by adding this VID to the interface behind $swp2.
    
    The one_bridge_two_pvids() test intends to check that there is no
    leakage of traffic between bridge ports which have a single VLAN - the
    PVID VLAN.
    
    Because of a typo, port $swp1 is configured with a PVID twice (second
    command overwrites first), and $swp2 isn't configured at all (and since
    the bridge vlan_default_pvid property is set to 0, this port will not
    have a PVID at all, so it will drop all untagged and priority-tagged
    traffic).
    
    So, instead of testing the configuration that was intended, we are
    testing a different one, where one port has PVID 2 and the other has
    no PVID. This incorrect version of the test should also pass, but is
    ineffective for its purpose, so fix the typo.
    
    This typo has an impact on results of the test,
    potentially leading to wrong conclusions regarding
    the functionality of a network device.
    
    The test results:
    
    TEST: Switch ports in VLAN-aware bridge with different PVIDs:
            Unicast non-IP untagged   [ OK ]
            Multicast non-IP untagged   [ OK ]
            Broadcast non-IP untagged   [ OK ]
            Unicast IPv4 untagged   [ OK ]
            Multicast IPv4 untagged   [ OK ]
            Unicast IPv6 untagged   [ OK ]
            Multicast IPv6 untagged   [ OK ]
            Unicast non-IP VID 1   [ OK ]
            Multicast non-IP VID 1   [ OK ]
            Broadcast non-IP VID 1   [ OK ]
            Unicast IPv4 VID 1   [ OK ]
            Multicast IPv4 VID 1   [ OK ]
            Unicast IPv6 VID 1   [ OK ]
            Multicast IPv6 VID 1   [ OK ]
            Unicast non-IP VID 4094   [ OK ]
            Multicast non-IP VID 4094   [ OK ]
            Broadcast non-IP VID 4094   [ OK ]
            Unicast IPv4 VID 4094   [ OK ]
            Multicast IPv4 VID 4094   [ OK ]
            Unicast IPv6 VID 4094   [ OK ]
            Multicast IPv6 VID 4094   [ OK ]
    
    Fixes: 476a4f05d9b8 ("selftests: forwarding: add a no_forwarding.sh test")
    Reviewed-by: Hangbin Liu <[email protected]>
    Reviewed-by: Shuah Khan <[email protected]>
    Signed-off-by: Kacper Ludwinski <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

selftests: net: Remove executable bits from library scripts [+ + +]

Author: Benjamin Poirier <[email protected]>
Date:   Wed Jan 31 09:08:46 2024 -0500

    selftests: net: Remove executable bits from library scripts
    
    [ Upstream commit 9d851dd4dab63e95c1911a2fa847796d1ec5d58d ]
    
    setup_loopback.sh and net_helper.sh are meant to be sourced from other
    scripts, not executed directly. Therefore, remove the executable bits from
    those files' permissions.
    
    This change is similar to commit 49078c1b80b6 ("selftests: forwarding:
    Remove executable bits from lib.sh")
    
    Fixes: 7d1575014a63 ("selftests/net: GRO coalesce test")
    Fixes: 3bdd9fd29cb0 ("selftests/net: synchronize udpgro tests' tx and rx connection")
    Suggested-by: Paolo Abeni <[email protected]>
    Signed-off-by: Benjamin Poirier <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

serial: protect uart_port_dtr_rts() in uart_shutdown() too [+ + +]

Author: Jiri Slaby (SUSE) <[email protected]>
Date:   Mon Aug 5 12:20:35 2024 +0200

    serial: protect uart_port_dtr_rts() in uart_shutdown() too
    
    [ Upstream commit 602babaa84d627923713acaf5f7e9a4369e77473 ]
    
    Commit af224ca2df29 (serial: core: Prevent unsafe uart port access, part
    3) added a few uport == NULL checks. It added one to uart_shutdown(), so
    the commit assumes uport can be NULL in there. But right after that
    protection, there is an unprotected "uart_port_dtr_rts(uport, false);"
    call. That is invoked only if HUPCL is set, so I assume that is the
    reason why we do not see lots of these reports.
    
    Or it cannot be NULL at this point at all for some reason :P.
    
    Until the above is investigated, stay on the safe side and move this
    dereference to the if too.
    
    I got this inconsistency from Coverity under CID 1585130. Thanks.
    
    Signed-off-by: Jiri Slaby (SUSE) <[email protected]>
    Cc: Peter Hurley <[email protected]>
    Cc: Greg Kroah-Hartman <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
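
Illustration (not part of the upstream commit): the serial entry above moves a dereference inside an existing NULL check. A rough user-space model, assuming a hypothetical struct port_model in place of the kernel's port structure and a boolean hupcl in place of the HUPCL termios flag, could look like this.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a UART port; not the kernel structure. */
struct port_model {
    bool dtr_rts_raised;
};

/* Plays the role of uart_port_dtr_rts(): it dereferences the port. */
static void port_set_dtr_rts(struct port_model *port, bool raise)
{
    port->dtr_rts_raised = raise;
}

/*
 * Shutdown path with the modem-line drop guarded by the NULL check.
 * Returns true if the lines were dropped.
 */
static bool shutdown_model(struct port_model *port, bool hupcl)
{
    if (port && hupcl) {
        port_set_dtr_rts(port, false);
        return true;
    }
    return false;
}
```

Calling shutdown_model(NULL, true) returns false without touching memory, which is the property the reordering is meant to guarantee.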

slip: make slhc_remember() more robust against malicious packets [+ + +]

Author: Eric Dumazet <[email protected]>
Date:   Wed Oct 9 09:11:32 2024 +0000

    slip: make slhc_remember() more robust against malicious packets
    
    [ Upstream commit 7d3fce8cbe3a70a1c7c06c9b53696be5d5d8dd5c ]
    
    syzbot found that slhc_remember() was missing checks against
    malicious packets [1].
    
    slhc_remember() only checked that the size of the packet was at least 20,
    which is not good enough.
    
    We need to make sure the packet includes the IPv4 and TCP headers
    that are supposed to be carried.
    
    Add iph and th pointers to make the code more readable.
    
    [1]
    
    BUG: KMSAN: uninit-value in slhc_remember+0x2e8/0x7b0 drivers/net/slip/slhc.c:666
      slhc_remember+0x2e8/0x7b0 drivers/net/slip/slhc.c:666
      ppp_receive_nonmp_frame+0xe45/0x35e0 drivers/net/ppp/ppp_generic.c:2455
      ppp_receive_frame drivers/net/ppp/ppp_generic.c:2372 [inline]
      ppp_do_recv+0x65f/0x40d0 drivers/net/ppp/ppp_generic.c:2212
      ppp_input+0x7dc/0xe60 drivers/net/ppp/ppp_generic.c:2327
      pppoe_rcv_core+0x1d3/0x720 drivers/net/ppp/pppoe.c:379
      sk_backlog_rcv+0x13b/0x420 include/net/sock.h:1113
      __release_sock+0x1da/0x330 net/core/sock.c:3072
      release_sock+0x6b/0x250 net/core/sock.c:3626
      pppoe_sendmsg+0x2b8/0xb90 drivers/net/ppp/pppoe.c:903
      sock_sendmsg_nosec net/socket.c:729 [inline]
      __sock_sendmsg+0x30f/0x380 net/socket.c:744
      ____sys_sendmsg+0x903/0xb60 net/socket.c:2602
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
      __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
      __do_sys_sendmmsg net/socket.c:2771 [inline]
      __se_sys_sendmmsg net/socket.c:2768 [inline]
      __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
      x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    Uninit was created at:
      slab_post_alloc_hook mm/slub.c:4091 [inline]
      slab_alloc_node mm/slub.c:4134 [inline]
      kmem_cache_alloc_node_noprof+0x6bf/0xb80 mm/slub.c:4186
      kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:587
      __alloc_skb+0x363/0x7b0 net/core/skbuff.c:678
      alloc_skb include/linux/skbuff.h:1322 [inline]
      sock_wmalloc+0xfe/0x1a0 net/core/sock.c:2732
      pppoe_sendmsg+0x3a7/0xb90 drivers/net/ppp/pppoe.c:867
      sock_sendmsg_nosec net/socket.c:729 [inline]
      __sock_sendmsg+0x30f/0x380 net/socket.c:744
      ____sys_sendmsg+0x903/0xb60 net/socket.c:2602
      ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2656
      __sys_sendmmsg+0x3c1/0x960 net/socket.c:2742
      __do_sys_sendmmsg net/socket.c:2771 [inline]
      __se_sys_sendmmsg net/socket.c:2768 [inline]
      __x64_sys_sendmmsg+0xbc/0x120 net/socket.c:2768
      x64_sys_call+0xb6e/0x3ba0 arch/x86/include/generated/asm/syscalls_64.h:308
      do_syscall_x64 arch/x86/entry/common.c:52 [inline]
      do_syscall_64+0xcd/0x1e0 arch/x86/entry/common.c:83
     entry_SYSCALL_64_after_hwframe+0x77/0x7f
    
    CPU: 0 UID: 0 PID: 5460 Comm: syz.2.33 Not tainted 6.12.0-rc2-syzkaller-00006-g87d6aab2389e #0
    Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
    
    Fixes: b5451d783ade ("slip: Move the SLIP drivers")
    Reported-by: [email protected]
    Closes: https://lore.kernel.org/netdev/[email protected]/T/#u
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
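
Illustration (not part of the upstream commit): the kind of validation the slip entry above asks for can be sketched as a standalone check. The helper name ipv4_tcp_headers_present() and the two constants are hypothetical; the sketch reads the IPv4 IHL field and the TCP data-offset field and confirms that both headers fit in the buffer, which is stricter than a bare "at least 20 bytes" test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR 20u /* minimum IPv4 header length in bytes */
#define TCP_MIN_HDR  20u /* minimum TCP header length in bytes */

/*
 * Hypothetical helper: true only if pkt[0..len) holds a complete IPv4
 * header followed by a complete TCP header, per their length fields.
 */
static bool ipv4_tcp_headers_present(const uint8_t *pkt, size_t len)
{
    size_t ip_hlen, tcp_hlen;

    if (len < IPV4_MIN_HDR)
        return false;

    /* IHL is the low nibble of byte 0, counted in 32-bit words. */
    ip_hlen = (size_t)(pkt[0] & 0x0f) * 4;
    if (ip_hlen < IPV4_MIN_HDR || len < ip_hlen + TCP_MIN_HDR)
        return false;

    /* Data offset is the high nibble of TCP byte 12, in 32-bit words. */
    tcp_hlen = (size_t)(pkt[ip_hlen + 12] >> 4) * 4;
    if (tcp_hlen < TCP_MIN_HDR)
        return false;

    return len >= ip_hlen + tcp_hlen;
}
```

Each field is read only after the preceding length check has shown that the byte lies inside the buffer, so the check itself never reads uninitialized or out-of-range data.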

smb: client: fix UAF in async decryption [+ + +]

Author: Enzo Matsumiya <[email protected]>
Date:   Thu Sep 26 14:46:13 2024 -0300

    smb: client: fix UAF in async decryption
    
    [ Upstream commit b0abcd65ec545701b8793e12bc27dc98042b151a ]
    
    Doing an async decryption (large read) crashes with a
    slab-use-after-free way down in the crypto API.
    
    Reproducer:
        # mount.cifs -o ...,seal,esize=1 //srv/share /mnt
        # dd if=/mnt/largefile of=/dev/null
        ...
        [  194.196391] ==================================================================
        [  194.196844] BUG: KASAN: slab-use-after-free in gf128mul_4k_lle+0xc1/0x110
        [  194.197269] Read of size 8 at addr ffff888112bd0448 by task kworker/u77:2/899
        [  194.197707]
        [  194.197818] CPU: 12 UID: 0 PID: 899 Comm: kworker/u77:2 Not tainted 6.11.0-lku-00028-gfca3ca14a17a-dirty #43
        [  194.198400] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014
        [  194.199046] Workqueue: smb3decryptd smb2_decrypt_offload [cifs]
        [  194.200032] Call Trace:
        [  194.200191]  <TASK>
        [  194.200327]  dump_stack_lvl+0x4e/0x70
        [  194.200558]  ? gf128mul_4k_lle+0xc1/0x110
        [  194.200809]  print_report+0x174/0x505
        [  194.201040]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
        [  194.201352]  ? srso_return_thunk+0x5/0x5f
        [  194.201604]  ? __virt_addr_valid+0xdf/0x1c0
        [  194.201868]  ? gf128mul_4k_lle+0xc1/0x110
        [  194.202128]  kasan_report+0xc8/0x150
        [  194.202361]  ? gf128mul_4k_lle+0xc1/0x110
        [  194.202616]  gf128mul_4k_lle+0xc1/0x110
        [  194.202863]  ghash_update+0x184/0x210
        [  194.203103]  shash_ahash_update+0x184/0x2a0
        [  194.203377]  ? __pfx_shash_ahash_update+0x10/0x10
        [  194.203651]  ? srso_return_thunk+0x5/0x5f
        [  194.203877]  ? crypto_gcm_init_common+0x1ba/0x340
        [  194.204142]  gcm_hash_assoc_remain_continue+0x10a/0x140
        [  194.204434]  crypt_message+0xec1/0x10a0 [cifs]
        [  194.206489]  ? __pfx_crypt_message+0x10/0x10 [cifs]
        [  194.208507]  ? srso_return_thunk+0x5/0x5f
        [  194.209205]  ? srso_return_thunk+0x5/0x5f
        [  194.209925]  ? srso_return_thunk+0x5/0x5f
        [  194.210443]  ? srso_return_thunk+0x5/0x5f
        [  194.211037]  decrypt_raw_data+0x15f/0x250 [cifs]
        [  194.212906]  ? __pfx_decrypt_raw_data+0x10/0x10 [cifs]
        [  194.214670]  ? srso_return_thunk+0x5/0x5f
        [  194.215193]  smb2_decrypt_offload+0x12a/0x6c0 [cifs]
    
    This is because TFM is being used in parallel.
    
    Fix this by allocating a new AEAD TFM for async decryption, but keep
    the existing one for synchronous READ cases (similar to what is done
    in smb3_calc_signature()).
    
    Also remove the calls to aead_request_set_callback() and
    crypto_wait_req() since it's always going to be a synchronous operation.
    
    Signed-off-by: Enzo Matsumiya <[email protected]>
    Signed-off-by: Steve French <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

soundwire: cadence: re-check Peripheral status with delayed_work [+ + +]

Author: Pierre-Louis Bossart <[email protected]>
Date:   Mon Aug 5 19:49:21 2024 +0800

    soundwire: cadence: re-check Peripheral status with delayed_work
    
    [ Upstream commit f8c35d61ba01afa76846905c67862cdace7f66b0 ]
    
    The SoundWire peripheral enumeration is entirely based on interrupts,
    more specifically sticky bits tracking state changes.
    
    This patch adds a defensive programming check on the actual status
    reported in PING frames. If for some reason an interrupt was lost or
    delayed, the delayed work would detect a peripheral change of status
    after the bus starts.
    
    The 100ms defined for the delay is not completely arbitrary: if a
    Peripheral didn't join the bus within that delay then probably the
    hardware link is broken, and conversely if the detection didn't happen
    because of software issues the 100ms is still acceptable in terms of
    user experience.
    
    The overhead of the one-shot workqueue is minimal, and the mutual
    exclusion ensures that the interrupt and delayed work cannot update
    the status concurrently.
    
    Reviewed-by: Liam Girdwood <[email protected]>
    Signed-off-by: Pierre-Louis Bossart <[email protected]>
    Signed-off-by: Bard Liao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

soundwire: intel_bus_common: enable interrupts before exiting reset [+ + +]

Author: Pierre-Louis Bossart <[email protected]>
Date:   Mon Aug 5 19:50:03 2024 +0800

    soundwire: intel_bus_common: enable interrupts before exiting reset
    
    [ Upstream commit 5aedb8d8336b0a0421b58ca27d1b572aa6695b5b ]
    
    The existing code enables the Cadence IP interrupts after the bus
    reset sequence. The problem with this sequence is that it might be
    pre-empted, giving SoundWire devices time to sync and report as
    ATTACHED before the interrupts are enabled. In that case, the Cadence
    IP will not detect a state change and will not throw an interrupt to
    proceed with the enumeration of a Device0.
    
    In our overnight stress tests, we observed that a slight
    sub-millisecond delay in enabling interrupts after the reset was
    correlated with detection failures. This problem is more prevalent on
    the LunarLake silicon, likely due to SOC integration changes, but it
    was observed on earlier generations as well.
    
    This patch reverts the sequence, with the interrupts enabled before
    performing the bus reset. This removes the race condition and makes
    sure the Cadence IP is able to detect the presence of a Device0 in all
    cases.
    
    Signed-off-by: Pierre-Louis Bossart <[email protected]>
    Signed-off-by: Bard Liao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Vinod Koul <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

spi: spi-fsl-lpspi: remove redundant spi_controller_put call [+ + +]

Author: Carlos Song <[email protected]>
Date:   Wed Apr 3 16:40:29 2024 +0800

    spi: spi-fsl-lpspi: remove redundant spi_controller_put call
    
    [ Upstream commit bff892acf79cec531da6cb21c50980a584ce1476 ]
    
    devm_spi_alloc_controller will allocate an SPI controller and
    automatically release a reference on it when dev is unbound from
    its driver. There is no need to call spi_controller_put explicitly
    to put the reference when the lpspi driver fails initialization.
    
    Fixes: 2ae0ab0143fc ("spi: lpspi: Avoid potential use-after-free in probe()")
    Signed-off-by: Carlos Song <[email protected]>
    Reviewed-by: Alexander Sverdlin <[email protected]>
    Link: https://msgid.link/r/[email protected]
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

staging: vme_user: added bound check to geoid [+ + +]

Author: Riyan Dhiman <[email protected]>
Date:   Tue Aug 27 18:26:05 2024 +0530

    staging: vme_user: added bound check to geoid
    
    [ Upstream commit a8a8b54350229f59c8ba6496fb5689a1632a59be ]
    
    The geoid is a module parameter that allows users to hardcode the slot number.
    A bound check for geoid was added in the probe function because only values
    from 0 up to (but not including) VME_MAX_SLOT are valid.
    
    Signed-off-by: Riyan Dhiman <[email protected]>
    Reviewed-by: Dan Carpenter <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
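
Illustration (not part of the upstream commit): a range check of the kind described in the vme_user entry above might look like the sketch below. EXAMPLE_MAX_SLOT and check_geoid() are hypothetical names, and the value 32 is an assumption chosen for the example; the entry itself only names the limit as VME_MAX_SLOT.

```c
#include <errno.h>

/* Assumed illustrative limit; not taken from the entry. */
#define EXAMPLE_MAX_SLOT 32

/* Accept geoid values in [0, EXAMPLE_MAX_SLOT); reject everything else. */
static int check_geoid(int geoid)
{
    if (geoid < 0 || geoid >= EXAMPLE_MAX_SLOT)
        return -EINVAL;
    return 0;
}
```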

SUNRPC: Fix integer overflow in decode_rc_list() [+ + +]

Author: Dan Carpenter <[email protected]>
Date:   Thu Sep 19 11:50:33 2024 +0300

    SUNRPC: Fix integer overflow in decode_rc_list()
    
    [ Upstream commit 6dbf1f341b6b35bcc20ff95b6b315e509f6c5369 ]
    
    The math in "rc_list->rcl_nrefcalls * 2 * sizeof(uint32_t)" could have an
    integer overflow.  Add bounds checking on rc_list->rcl_nrefcalls to fix
    that.
    
    Fixes: 4aece6a19cf7 ("nfs41: cb_sequence xdr implementation")
    Signed-off-by: Dan Carpenter <[email protected]>
    Signed-off-by: Anna Schumaker <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
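
Illustration (not part of the upstream commit): one common way to bound a count before multiplying it, as the SUNRPC entry above calls for, is to compare it against a limit divided by the element size. The helper refcall_bytes() and its max_bytes parameter are hypothetical; max_bytes stands for whatever upper bound the caller trusts, for example the size of the received buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute nrefcalls * 2 * sizeof(uint32_t) only
 * after checking nrefcalls against a limit, so the product cannot wrap.
 */
static bool refcall_bytes(uint32_t nrefcalls, size_t max_bytes, size_t *out)
{
    const size_t per_call = 2 * sizeof(uint32_t);

    /* Divide instead of multiply so the check itself cannot overflow. */
    if (nrefcalls > max_bytes / per_call)
        return false;

    *out = (size_t)nrefcalls * per_call;
    return true;
}
```

Because the comparison uses division, an attacker-controlled count near the top of the 32-bit range is rejected before any multiplication happens.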

tcp: fix tcp_enter_recovery() to zero retrans_stamp when it's safe [+ + +]

Author: Neal Cardwell <[email protected]>
Date:   Tue Oct 1 20:05:16 2024 +0000

    tcp: fix tcp_enter_recovery() to zero retrans_stamp when it's safe
    
    [ Upstream commit b41b4cbd9655bcebcce941bef3601db8110335be ]
    
    Fix tcp_enter_recovery() so that if there are no retransmits out then
    we zero retrans_stamp when entering fast recovery. This is necessary
    to fix two buggy behaviors.
    
    Currently a non-zero retrans_stamp value can persist across multiple
    back-to-back loss recovery episodes. This is because we generally only
    clear retrans_stamp if we are completely done with loss recoveries,
    and get to tcp_try_to_open() and find !tcp_any_retrans_done(sk). This
    behavior causes two bugs:
    
    (1) When a loss recovery episode (CA_Loss or CA_Recovery) is followed
    immediately by a new CA_Recovery, the retrans_stamp value can persist
    and can be a time before this new CA_Recovery episode starts. That
    means that timestamp-based undo will be using the wrong retrans_stamp
    (a value that is too old) when comparing incoming TS ecr values to
    retrans_stamp to see if the current fast recovery episode can be
    undone.
    
    (2) If there is a roughly minutes-long sequence of back-to-back fast
    recovery episodes, one after another (e.g. in a shallow-buffered or
    policed bottleneck), where each fast recovery successfully makes
    forward progress and recovers one window of sequence space (but leaves
    at least one retransmit in flight at the end of the recovery),
    followed by several RTOs, then the ETIMEDOUT check may be using the
    wrong retrans_stamp (a value set at the start of the first fast
    recovery in the sequence). This can cause a very premature ETIMEDOUT,
    killing the connection prematurely.
    
    This commit changes the code to zero retrans_stamp when entering fast
    recovery, when this is known to be safe (no retransmits are out in the
    network). That ensures that when starting a fast recovery episode, and
    it is safe to do so, retrans_stamp is set when we send the fast
    retransmit packet. That addresses both bug (1) and bug (2) by ensuring
    that (if no retransmits are out when we start a fast recovery) we use
    the initial fast retransmit of this fast recovery as the time value
    for undo and ETIMEDOUT calculations.
    
    This makes intuitive sense, since the start of a new fast recovery
    episode (in a scenario where no lost packets are out in the network)
    means that the connection has made forward progress since the last RTO
    or fast recovery, and we should thus "restart the clock" used for both
    undo and ETIMEDOUT logic.
    
    Note that if there *are* retransmits out in the network when we start
    fast recovery, there can still be undesirable (1)/(2) issues. For
    example, after this patch we can still have the (1) and (2) problems
    in cases like this:
    
    + round 1: sender sends flight 1
    
    + round 2: sender receives SACKs and enters fast recovery 1,
      retransmits some packets in flight 1 and then sends some new data as
      flight 2
    
    + round 3: sender receives some SACKs for flight 2, notes losses, and
      retransmits some packets to fill the holes in flight 2
    
    + fast recovery has some lost retransmits in flight 1 and continues
      for one or more rounds sending retransmits for flight 1 and flight 2
    
    + fast recovery 1 completes when snd_una reaches high_seq at end of
      flight 1
    
    + there are still holes in the SACK scoreboard in flight 2, so we
      enter fast recovery 2, but some retransmits in the flight 2 sequence
      range are still in flight (retrans_out > 0), so we can't execute the
      new retrans_stamp=0 added here to clear retrans_stamp
    
    It's not yet clear how to fix these remaining (1)/(2) issues in an
    efficient way without breaking undo behavior, given that retrans_stamp
    is currently used for undo and ETIMEDOUT. Perhaps the optimal (but
    expensive) strategy would be to set retrans_stamp to the timestamp of
    the earliest outstanding retransmit when entering fast recovery. But
    at least this commit makes things better.
    
    Note that this does not change the semantics of retrans_stamp; it
    simply makes retrans_stamp accurate in some cases where it was not
    before:
    
    (1) Some loss recovery, followed by an immediate entry into a fast
    recovery, where there are no retransmits out when entering the fast
    recovery.
    
    (2) When a TFO server has a SYNACK retransmit that sets retrans_stamp,
    and then the ACK that completes the 3-way handshake has SACK blocks
    that trigger a fast recovery. In this case when entering fast recovery
    we want to zero out the retrans_stamp from the TFO SYNACK retransmit,
    and set the retrans_stamp based on the timestamp of the fast recovery.
    
    We introduce a tcp_retrans_stamp_cleanup() helper, because this
    two-line sequence already appears in 3 places and is about to appear
    in 2 more as a result of this bug fix patch series. Once this bug fix
    patch series in the net branch makes it into the net-next branch
    we'll update the 3 other call sites to use the new helper.
    
    This is a long-standing issue. The Fixes tag below is chosen to be the
    oldest commit at which the patch will apply cleanly, which is from
    Linux v3.5 in 2012.
    
    Fixes: 1fbc340514fc ("tcp: early retransmit: tcp_enter_recovery()")
    Signed-off-by: Neal Cardwell <[email protected]>
    Signed-off-by: Yuchung Cheng <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>
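
Illustration (not part of the upstream commit): the "zero retrans_stamp when it is safe" rule from the tcp_enter_recovery() entry above can be modeled in user space as follows. struct conn_model and both functions are hypothetical, and the safety condition is simplified to "no retransmits in flight" (retrans_out == 0), following the wording of the entry rather than the kernel's actual helper.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced model of per-connection state. */
struct conn_model {
    uint32_t retrans_stamp; /* time of first retransmit, 0 means unset */
    uint32_t retrans_out;   /* retransmitted packets still in flight */
};

/* Zero retrans_stamp only when no retransmits are in flight. */
static void retrans_stamp_cleanup_model(struct conn_model *c)
{
    if (c->retrans_out == 0)
        c->retrans_stamp = 0;
}

/* Entering fast recovery: try the cleanup, then send a fast retransmit. */
static void enter_recovery_model(struct conn_model *c, uint32_t now)
{
    retrans_stamp_cleanup_model(c);

    /* The retransmit sets the stamp only if it is currently unset. */
    if (c->retrans_stamp == 0)
        c->retrans_stamp = now;
    c->retrans_out++;
}
```

In this model a stale stamp from an earlier episode is replaced by the time of the new fast retransmit, while a stamp that still describes in-flight retransmits is left alone.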

tcp: fix TFO SYN_RECV to not zero retrans_stamp with retransmits out [+ + +]

Author: Neal Cardwell <[email protected]>
Date:   Tue Oct 1 20:05:17 2024 +0000

    tcp: fix TFO SYN_RECV to not zero retrans_stamp with retransmits out
    
    [ Upstream commit 27c80efcc20486c82698f05f00e288b44513c86b ]
    
    Fix tcp_rcv_synrecv_state_fastopen() to not zero retrans_stamp
    if retransmits are outstanding.
    
    tcp_fastopen_synack_timer() sets retrans_stamp, so typically we'll
    need to zero retrans_stamp here to prevent spurious
    retransmits_timed_out(). The logic to zero retrans_stamp is from this
    2019 commit:
    
    commit cd736d8b67fb ("tcp: fix retrans timestamp on passive Fast Open")
    
    However, in the corner case where the ACK of our TFO SYNACK carried
    some SACK blocks that caused us to enter TCP_CA_Recovery then that
    non-zero retrans_stamp corresponds to the active fast recovery, and we
    need to leave retrans_stamp with its current non-zero value, for
    correct ETIMEDOUT and undo behavior.
    
    Fixes: cd736d8b67fb ("tcp: fix retrans timestamp on passive Fast Open")
    Signed-off-by: Neal Cardwell <[email protected]>
    Signed-off-by: Yuchung Cheng <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tcp: fix to allow timestamp undo if no retransmits were sent [+ + +]

Author: Neal Cardwell <[email protected]>
Date:   Tue Oct 1 20:05:15 2024 +0000

    tcp: fix to allow timestamp undo if no retransmits were sent
    
    [ Upstream commit e37ab7373696e650d3b6262a5b882aadad69bb9e ]
    
    Fix the TCP loss recovery undo logic in tcp_packet_delayed() so that
    it can trigger undo even if TSQ prevents a fast recovery episode from
    reaching tcp_retransmit_skb().
    
    Geumhwan Yu <[email protected]> recently reported that after
    this commit from 2019:
    
    commit bc9f38c8328e ("tcp: avoid unconditional congestion window undo
    on SYN retransmit")
    
    ...and before this fix we could have buggy scenarios like the
    following:
    
    + Due to reordering, a TCP connection receives some SACKs and enters a
      spurious fast recovery.
    
    + TSQ prevents all invocations of tcp_retransmit_skb(), because many
      skbs are queued in lower layers of the sending machine's network
      stack; thus tp->retrans_stamp remains 0.
    
    + The connection receives a TCP timestamp ECR value echoing a
      timestamp before the fast recovery, indicating that the fast
      recovery was spurious.
    
    + The connection fails to undo the spurious fast recovery because
      tp->retrans_stamp is 0, and thus tcp_packet_delayed() returns false,
      due to the new logic in the 2019 commit: commit bc9f38c8328e ("tcp:
      avoid unconditional congestion window undo on SYN retransmit")
    
    This fix tweaks the logic to be more similar to the
    tcp_packet_delayed() logic before bc9f38c8328e, except that we take
    care not to be fooled by the FLAG_SYN_ACKED code path zeroing out
    tp->retrans_stamp (the bug noted and fixed by Yuchung in
    bc9f38c8328e).
    
    Note that this returns the high-level behavior of tcp_packet_delayed()
    to again match the comment for the function, which says: "Nothing was
    retransmitted or returned timestamp is less than timestamp of the
    first retransmission." Note that this comment is in the original
    2005-04-16 Linux git commit, so this is evidently long-standing
    behavior.
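    
    The resulting rule can be sketched in user space roughly as follows.
    This is a minimal model under stated assumptions, not the upstream
    implementation: the struct, the enum and the helper names are
    hypothetical, and the real function also checks that a timestamp
    option was actually received.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>

    enum conn_state { STATE_SYN_SENT, STATE_ESTABLISHED }; /* hypothetical */

    struct undo_model {
        uint32_t retrans_stamp; /* 0 means no retransmit was recorded */
        uint32_t rcv_tsecr;     /* timestamp echoed by the peer */
        enum conn_state state;
    };

    /* Wrap-safe "a is before b" for 32-bit timestamps. */
    static bool ts_before(uint32_t a, uint32_t b)
    {
        return (int32_t)(a - b) < 0;
    }

    static bool packet_delayed(const struct undo_model *tp)
    {
        /* The echoed timestamp predates the first retransmission. */
        if (tp->retrans_stamp && ts_before(tp->rcv_tsecr, tp->retrans_stamp))
            return true;

        /*
         * Nothing was retransmitted (for example TSQ blocked every
         * attempt). A zero stamp in SYN_SENT is ignored, because it may
         * have been cleared even though the SYN was retransmitted.
         */
        if (!tp->retrans_stamp && tp->state != STATE_SYN_SENT)
            return true;

        return false;
    }
    ```
    
    With retrans_stamp equal to 0 on an established connection the model
    returns true, which is the TSQ scenario listed above.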
    
    Fixes: bc9f38c8328e ("tcp: avoid unconditional congestion window undo on SYN retransmit")
    Reported-by: Geumhwan Yu <[email protected]>
    Diagnosed-by: Geumhwan Yu <[email protected]>
    Signed-off-by: Neal Cardwell <[email protected]>
    Signed-off-by: Yuchung Cheng <[email protected]>
    Signed-off-by: Eric Dumazet <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jakub Kicinski <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tcp: new TCP_INFO stats for RTO events [+ + +]

Author: Aananth V <[email protected]>
Date:   Thu Sep 14 14:36:21 2023 +0000

    tcp: new TCP_INFO stats for RTO events
    
    [ Upstream commit 3868ab0f192581eff978501a05f3dc2e01541d77 ]
    
    The 2023 SIGCOMM paper "Improving Network Availability with Protective
    ReRoute" has indicated Linux TCP's RTO-triggered txhash rehashing can
    effectively reduce application disruption during outages. To better
    measure the efficacy of this feature, this patch adds three more
    detailed stats during RTO recovery and exports via TCP_INFO.
    Applications and monitoring systems can leverage this data to measure
    the network path diversity and end-to-end repair latency during network
    outages to improve their network infrastructure.
    
    The following counters are added to tcp_sock in order to track RTO
    events over the lifetime of a TCP socket.
    
    1. u16 total_rto - Counts the total number of RTO timeouts.
    2. u16 total_rto_recoveries - Counts the total number of RTO recoveries.
    3. u32 total_rto_time - Counts the total time spent (ms) in RTO
                            recoveries. (time spent in CA_Loss and
                            CA_Recovery states)
    
    To compute total_rto_time, we add a new u32 rto_stamp field to
    tcp_sock. rto_stamp records the start timestamp (ms) of the last RTO
    recovery (CA_Loss).
    
    Corresponding fields are also added to the tcp_info struct.
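    
    To make the bookkeeping concrete, here is a small user-space model of
    the counters. It is a sketch under assumptions: the field names follow
    the list above, but the in_rto_recovery flag, the two helper functions
    and the exact points at which each counter is bumped are illustrative
    and may not match the kernel code.
    
    ```c
    #include <stdbool.h>
    #include <stdint.h>

    struct rto_stats {
        uint16_t total_rto;            /* number of RTO timeouts */
        uint16_t total_rto_recoveries; /* number of RTO recovery episodes */
        uint32_t total_rto_time;       /* ms spent in RTO recoveries */
        uint32_t rto_stamp;            /* ms, start of the current recovery */
        bool     in_rto_recovery;      /* hypothetical flag for this model */
    };

    /* Called for every RTO timer expiry. */
    static void on_rto_timeout(struct rto_stats *s, uint32_t now_ms)
    {
        if (!s->in_rto_recovery) {
            /* First timeout of an episode: start the clock. */
            s->in_rto_recovery = true;
            s->total_rto_recoveries++;
            s->rto_stamp = now_ms;
        }
        s->total_rto++;
    }

    /* Called when the connection leaves RTO recovery. */
    static void on_rto_recovery_done(struct rto_stats *s, uint32_t now_ms)
    {
        if (!s->in_rto_recovery)
            return;
        s->total_rto_time += now_ms - s->rto_stamp;
        s->in_rto_recovery = false;
    }
    ```
    
    Two timeouts at 1000 ms and 1200 ms followed by recovery at 1500 ms
    leave the model with two RTOs, one recovery and 500 ms of recovery
    time.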
    
    Signed-off-by: Aananth V <[email protected]>
    Signed-off-by: Neal Cardwell <[email protected]>
    Signed-off-by: Yuchung Cheng <[email protected]>
    Reviewed-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
    Stable-dep-of: 27c80efcc204 ("tcp: fix TFO SYN_RECV to not zero retrans_stamp with retransmits out")
    Signed-off-by: Sasha Levin <[email protected]>

thermal: int340x: processor_thermal: Set feature mask before proc_thermal_add [+ + +]

Author: Srinivas Pandruvada <[email protected]>
Date:   Mon Oct 9 12:05:34 2023 -0700

    thermal: int340x: processor_thermal: Set feature mask before proc_thermal_add
    
    [ Upstream commit 6ebc25d8b053a208786295bab58abbb66b39c318 ]
    
    The function proc_thermal_add() adds sysfs entries for power limits.
    
    The feature mask of available features is not present at that time, so
    it cannot be used by proc_thermal_add() to selectively create sysfs
    attributes.
    
    The feature mask is set by proc_thermal_mmio_add(), so modify the code
    to call it before proc_thermal_add() so as to allow the latter to use
    the feature mask.
    
    There is no functional impact with this change.
    
    Signed-off-by: Srinivas Pandruvada <[email protected]>
    [ rjw: Changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Stable-dep-of: 99ca0b57e49f ("thermal: intel: int340x: processor: Fix warning during module unload")
    Signed-off-by: Sasha Levin <[email protected]>

thermal: intel: int340x: processor: Fix warning during module unload [+ + +]

Author: Zhang Rui <[email protected]>
Date:   Mon Sep 30 16:17:57 2024 +0800

    thermal: intel: int340x: processor: Fix warning during module unload
    
    [ Upstream commit 99ca0b57e49fb73624eede1c4396d9e3d10ccf14 ]
    
    The processor_thermal driver uses pcim_enable_device() to enable a PCI
    device, which means the device will be automatically disabled on driver
    detach.  Thus there is no need to call pci_disable_device() again on it.
    
    With recent PCI device resource management improvements, e.g. commit
    f748a07a0b64 ("PCI: Remove legacy pcim_release()"), this problem is
    exposed and triggers the warning below.
    
     [  224.010735] proc_thermal_pci 0000:00:04.0: disabling already-disabled device
     [  224.010747] WARNING: CPU: 8 PID: 4442 at drivers/pci/pci.c:2250 pci_disable_device+0xe5/0x100
     ...
     [  224.010844] Call Trace:
     [  224.010845]  <TASK>
     [  224.010847]  ? show_regs+0x6d/0x80
     [  224.010851]  ? __warn+0x8c/0x140
     [  224.010854]  ? pci_disable_device+0xe5/0x100
     [  224.010856]  ? report_bug+0x1c9/0x1e0
     [  224.010859]  ? handle_bug+0x46/0x80
     [  224.010862]  ? exc_invalid_op+0x1d/0x80
     [  224.010863]  ? asm_exc_invalid_op+0x1f/0x30
     [  224.010867]  ? pci_disable_device+0xe5/0x100
     [  224.010869]  ? pci_disable_device+0xe5/0x100
     [  224.010871]  ? kfree+0x21a/0x2b0
     [  224.010873]  pcim_disable_device+0x20/0x30
     [  224.010875]  devm_action_release+0x16/0x20
     [  224.010878]  release_nodes+0x47/0xc0
     [  224.010880]  devres_release_all+0x9f/0xe0
     [  224.010883]  device_unbind_cleanup+0x12/0x80
     [  224.010885]  device_release_driver_internal+0x1ca/0x210
     [  224.010887]  driver_detach+0x4e/0xa0
     [  224.010889]  bus_remove_driver+0x6f/0xf0
     [  224.010890]  driver_unregister+0x35/0x60
     [  224.010892]  pci_unregister_driver+0x44/0x90
     [  224.010894]  proc_thermal_pci_driver_exit+0x14/0x5f0 [processor_thermal_device_pci]
     ...
     [  224.010921] ---[ end trace 0000000000000000 ]---
    
    Remove the excess pci_disable_device() calls.
    
    Fixes: acd65d5d1cf4 ("thermal/drivers/int340x/processor_thermal: Add PCI MMIO based thermal driver")
    Signed-off-by: Zhang Rui <[email protected]>
    Reviewed-by: Srinivas Pandruvada <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    [ rjw: Subject and changelog edits ]
    Signed-off-by: Rafael J. Wysocki <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tools/iio: Add memory allocation failure check for trigger_name [+ + +]

Author: Zhu Jun <[email protected]>
Date:   Wed Aug 28 02:31:29 2024 -0700

    tools/iio: Add memory allocation failure check for trigger_name
    
    [ Upstream commit 3c6b818b097dd6932859bcc3d6722a74ec5931c1 ]
    
    Added a check to handle memory allocation failure for `trigger_name`
    and return `-ENOMEM`.
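    
    The pattern is the usual one for a checked allocation. The sketch
    below is illustrative only: build_trigger_name() and the name format
    are hypothetical and are not taken from the tools/iio sources.
    
    ```c
    #include <errno.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical helper: builds "<device>-dev<N>" in a fresh buffer. */
    static int build_trigger_name(const char *device_name, int dev_num,
                                  char **out)
    {
        char *trigger_name;
        int len;

        /* First pass only measures the formatted length. */
        len = snprintf(NULL, 0, "%s-dev%d", device_name, dev_num);
        if (len < 0)
            return -EINVAL;

        trigger_name = malloc((size_t)len + 1);
        if (!trigger_name)
            return -ENOMEM; /* report the failure instead of using NULL */

        snprintf(trigger_name, (size_t)len + 1, "%s-dev%d",
                 device_name, dev_num);
        *out = trigger_name;
        return 0;
    }
    ```
    
    The caller owns the returned buffer and releases it with free().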
    
    Signed-off-by: Zhu Jun <[email protected]>
    Link: https://patch.msgid.link/[email protected]
    Signed-off-by: Jonathan Cameron <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tracing: Have saved_cmdlines arrays all in one allocation [+ + +]

Author: Steven Rostedt (Google) <[email protected]>
Date:   Tue Feb 20 09:06:14 2024 -0500

    tracing: Have saved_cmdlines arrays all in one allocation
    
    [ Upstream commit 0b18c852cc6fb8284ac0ab97e3e840974a6a8a64 ]
    
    The saved_cmdlines have three arrays for mapping PIDs to COMMs:
    
     - map_pid_to_cmdline[]
     - map_cmdline_to_pid[]
     - saved_cmdlines
    
    The map_pid_to_cmdline[] is PID_MAX_DEFAULT in size and holds the index
    into the other arrays. The map_cmdline_to_pid[] is a mapping back to the
    full pid as it can be larger than PID_MAX_DEFAULT. And the
    saved_cmdlines[] just holds the COMMs associated to the pids.
    
    Currently the map_pid_to_cmdline[] and saved_cmdlines[] are allocated
    together (in reality the saved_cmdlines is just in the memory of the
    rounding of the allocation of the structure as it is always allocated in
    powers of two). The map_cmdline_to_pid[] array is allocated separately.
    
    Since the rounding to a power of two is rather large (it allows for 8000
    elements in saved_cmdlines), also include the map_cmdline_to_pid[] array.
    (This drops it to 6000 by default, which is still plenty for most use
    cases). This saves even more memory as the map_cmdline_to_pid[] array
    doesn't need to be allocated.
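    
    The 8000 and 6000 figures follow from simple arithmetic on the
    allocation size. The sketch below reproduces that arithmetic in user
    space; the struct is a hypothetical model of the buffer header, and
    PID_MAX_DEFAULT = 0x8000 and a 16-byte COMM are assumptions about the
    default configuration.
    
    ```c
    #include <stddef.h>

    #define PID_MAX_DEFAULT 0x8000 /* assumed default */
    #define TASK_COMM_LEN   16     /* assumed bytes per saved COMM */

    /* Hypothetical model of the header that precedes the COMM strings. */
    struct cmdlines_model {
        unsigned map_pid_to_cmdline[PID_MAX_DEFAULT + 1];
        unsigned *map_cmdline_to_pid;
        unsigned cmdline_num;
        int cmdline_idx;
        char saved_cmdlines[];
    };

    static size_t round_up_pow2(size_t n)
    {
        size_t p = 1;

        while (p < n)
            p <<= 1;
        return p;
    }

    /* Entries that fit in the power-of-two slack after the header. */
    static size_t entries_that_fit(size_t bytes_per_entry)
    {
        size_t hdr = offsetof(struct cmdlines_model, saved_cmdlines);
        size_t total = round_up_pow2(hdr + bytes_per_entry);

        return (total - hdr) / bytes_per_entry;
    }
    ```
    
    entries_that_fit(TASK_COMM_LEN) models the old layout, and
    entries_that_fit(TASK_COMM_LEN + sizeof(unsigned)) models the layout
    with the map_cmdline_to_pid[] slots stored in the same allocation. On
    a typical 64-bit build these come out slightly above 8000 and 6000.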
    
    Link: https://lore.kernel.org/linux-trace-kernel/[email protected]/
    Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
    
    Cc: Mark Rutland <[email protected]>
    Cc: Mathieu Desnoyers <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Cc: Tim Chen <[email protected]>
    Cc: Vincent Donnefort <[email protected]>
    Cc: Sven Schnelle <[email protected]>
    Cc: Mete Durlu <[email protected]>
    Fixes: 44dc5c41b5b1 ("tracing: Fix wasted memory in saved_cmdlines logic")
    Acked-by: Masami Hiramatsu (Google) <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

tracing: Remove precision vsnprintf() check from print event [+ + +]

Author: Steven Rostedt (Google) <[email protected]>
Date:   Mon Mar 4 17:43:41 2024 -0500

    tracing: Remove precision vsnprintf() check from print event
    
    [ Upstream commit 5efd3e2aef91d2d812290dcb25b2058e6f3f532c ]
    
    This reverts 60be76eeabb3d ("tracing: Add size check when printing
    trace_marker output"). The only reason the precision check was added
    was because of a bug that miscalculated the write size of the string into
    the ring buffer and it truncated it removing the terminating nul byte. On
    reading the trace it crashed the kernel. But this was due to the bug in
    the code that happened during development and should never happen in
    practice. If anything, the precision can hide bugs where the string in the
    ring buffer isn't nul terminated and it will not be checked.
    
    Link: https://lore.kernel.org/all/[email protected]/
    Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
    Link: https://lore.kernel.org/all/[email protected]/
    Link: https://lore.kernel.org/linux-trace-kernel/[email protected]
    
    Cc: Masami Hiramatsu <[email protected]>
    Cc: Linus Torvalds <[email protected]>
    Fixes: 60be76eeabb3d ("tracing: Add size check when printing trace_marker output")
    Reported-by: Sachin Sant <[email protected]>
    Tested-by: Sachin Sant <[email protected]>
    Reviewed-by: Mathieu Desnoyers <[email protected]>
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

unicode: Don't special case ignorable code points [+ + +]

Author: Gabriel Krisman Bertazi <[email protected]>
Date:   Tue Oct 8 18:43:16 2024 -0400

    unicode: Don't special case ignorable code points
    
    commit 5c26d2f1d3f5e4be3e196526bead29ecb139cf91 upstream.
    
    We don't need to handle them separately. Instead, just let them
    decompose/casefold to themselves.
    
    Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: chipidea: udc: enable suspend interrupt after usb reset [+ + +]

Author: Xu Yang <[email protected]>
Date:   Fri Aug 23 15:38:32 2024 +0800

    usb: chipidea: udc: enable suspend interrupt after usb reset
    
    [ Upstream commit e4fdcc10092fb244218013bfe8ff01c55d54e8e4 ]
    
    Currently, the suspend interrupt is enabled before the pullup enable
    operation. This causes a suspend interrupt to assert right after DP is
    pulled up. That interrupt is meaningless, so avoid it by enabling the
    suspend interrupt only after the USB reset has completed.
    
    Signed-off-by: Xu Yang <[email protected]>
    Acked-by: Peter Chen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario [+ + +]

Author: Shawn Shao <[email protected]>
Date:   Fri Aug 30 11:17:09 2024 +0800

    usb: dwc2: Adjust the timing of USB Driver Interrupt Registration in the Crashkernel Scenario
    
    [ Upstream commit 4058c39bd176daf11a826802d940d86292a6b02b ]
    
    The issue is that before entering the crash kernel, the DWC USB controller
    did not perform operations such as resetting the interrupt mask bits.
    After entering the crash kernel, before the USB interrupt handler
    registration was completed while loading the DWC USB driver, a GINTSTS_SOF
    interrupt was received. This triggered the misrouted_irq process within the
    GIC handling framework, ultimately leading to the misrouting of the
    interrupt, causing it to be handled by the wrong interrupt handler
    and resulting in the issue.
    
    Summary: In a scenario where the kernel triggers a panic and enters
    the crash kernel, it is necessary to ensure that the interrupt mask
    bit is not enabled before the interrupt registration is complete.
    If an interrupt reaches the CPU at this moment, it will certainly
    not be handled correctly, especially in cases where this interrupt
    is reported frequently.
    
    Please refer to the Crashkernel dmesg information as follows
    (the message on line 3 was added before devm_request_irq is
    called by the dwc2_driver_probe function):
    [    5.866837][    T1] dwc2 JMIC0010:01: supply vusb_d not found, using dummy regulator
    [    5.874588][    T1] dwc2 JMIC0010:01: supply vusb_a not found, using dummy regulator
    [    5.882335][    T1] dwc2 JMIC0010:01: before devm_request_irq  irq: [71], gintmsk[0xf300080e], gintsts[0x04200009]
    [    5.892686][    C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-jmnd1.2_RC #18
    [    5.900327][    C0] Hardware name: CMSS HyperCard4-25G/HyperCard4-25G, BIOS 1.6.4 Jul  8 2024
    [    5.908836][    C0] Call trace:
    [    5.911965][    C0]  dump_backtrace+0x0/0x1f0
    [    5.916308][    C0]  show_stack+0x20/0x30
    [    5.920304][    C0]  dump_stack+0xd8/0x140
    [    5.924387][    C0]  pcie_xxx_handler+0x3c/0x1d8
    [    5.930121][    C0]  __handle_irq_event_percpu+0x64/0x1e0
    [    5.935506][    C0]  handle_irq_event+0x80/0x1d0
    [    5.940109][    C0]  try_one_irq+0x138/0x174
    [    5.944365][    C0]  misrouted_irq+0x134/0x140
    [    5.948795][    C0]  note_interrupt+0x1d0/0x30c
    [    5.953311][    C0]  handle_irq_event+0x13c/0x1d0
    [    5.958001][    C0]  handle_fasteoi_irq+0xd4/0x260
    [    5.962779][    C0]  __handle_domain_irq+0x88/0xf0
    [    5.967555][    C0]  gic_handle_irq+0x9c/0x2f0
    [    5.971985][    C0]  el1_irq+0xb8/0x140
    [    5.975807][    C0]  __setup_irq+0x3dc/0x7cc
    [    5.980064][    C0]  request_threaded_irq+0xf4/0x1b4
    [    5.985015][    C0]  devm_request_threaded_irq+0x80/0x100
    [    5.990400][    C0]  dwc2_driver_probe+0x1b8/0x6b0
    [    5.995178][    C0]  platform_drv_probe+0x5c/0xb0
    [    5.999868][    C0]  really_probe+0xf8/0x51c
    [    6.004125][    C0]  driver_probe_device+0xfc/0x170
    [    6.008989][    C0]  device_driver_attach+0xc8/0xd0
    [    6.013853][    C0]  __driver_attach+0xe8/0x1b0
    [    6.018369][    C0]  bus_for_each_dev+0x7c/0xdc
    [    6.022886][    C0]  driver_attach+0x2c/0x3c
    [    6.027143][    C0]  bus_add_driver+0xdc/0x240
    [    6.031573][    C0]  driver_register+0x80/0x13c
    [    6.036090][    C0]  __platform_driver_register+0x50/0x5c
    [    6.041476][    C0]  dwc2_platform_driver_init+0x24/0x30
    [    6.046774][    C0]  do_one_initcall+0x50/0x25c
    [    6.051291][    C0]  do_initcall_level+0xe4/0xfc
    [    6.055894][    C0]  do_initcalls+0x80/0xa4
    [    6.060064][    C0]  kernel_init_freeable+0x198/0x240
    [    6.065102][    C0]  kernel_init+0x1c/0x12c
    
    Signed-off-by: Shawn Shao <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

usb: dwc3: core: Stop processing of pending events if controller is halted [+ + +]

Author: Selvarasu Ganesan <[email protected]>
Date:   Tue Sep 17 04:48:09 2024 +0530

    usb: dwc3: core: Stop processing of pending events if controller is halted
    
    commit 0d410e8913f5cffebcca79ffdd596009d4a13a28 upstream.
    
    This commit addresses an issue where events were being processed when
    the controller was in a halted state. Fix this by no longer processing
    events in that state, because the event count is considered stale or
    invalid when the controller is halted.
    
    Fixes: fc8bb91bc83e ("usb: dwc3: implement runtime PM")
    Cc: [email protected]
    Signed-off-by: Selvarasu Ganesan <[email protected]>
    Suggested-by: Thinh Nguyen <[email protected]>
    Acked-by: Thinh Nguyen <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: gadget: core: force synchronous registration [+ + +]

Author: John Keeping <[email protected]>
Date:   Fri Sep 13 11:23:23 2024 +0100

    usb: gadget: core: force synchronous registration
    
    commit df9158826b00e53f42c67d62c887a84490d80a0a upstream.
    
    Registering a gadget driver is expected to complete synchronously:
    immediately after calling driver_register(), this function checks that
    the driver has bound, so that it can return an error if it has not.
    
    Set PROBE_FORCE_SYNCHRONOUS to ensure this is the case even when
    asynchronous probing is set as the default.
    
    Fixes: fc274c1e99731 ("USB: gadget: Add a new bus for gadgets")
    Cc: [email protected]
    Signed-off-by: John Keeping <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: storage: ignore bogus device raised by JieLi BR21 USB sound chip [+ + +]

Author: Icenowy Zheng <[email protected]>
Date:   Tue Oct 1 16:34:07 2024 +0800

    usb: storage: ignore bogus device raised by JieLi BR21 USB sound chip
    
    commit a6555cb1cb69db479d0760e392c175ba32426842 upstream.
    
    JieLi tends to use SCSI via USB Mass Storage to implement their own
    proprietary commands instead of implementing another USB interface.
    Enumerating it as a generic mass storage device leads to a Hardware
    Error sense key being reported.
    
    Ignore this bogus device to prevent an unusable sdX device file from
    appearing.
    
    Signed-off-by: Icenowy Zheng <[email protected]>
    Cc: stable <[email protected]>
    Acked-by: Alan Stern <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

usb: typec: tipd: Free IRQ only if it was requested before [+ + +]

Author: Wadim Egorov <[email protected]>
Date:   Fri Aug 16 14:41:50 2024 +0200

    usb: typec: tipd: Free IRQ only if it was requested before
    
    [ Upstream commit db63d9868f7f310de44ba7bea584e2454f8b4ed0 ]
    
    In polling mode, if no IRQ was requested there is no need to free it.
    Call devm_free_irq() only if client->irq is set. This fixes the warning
    caused by the tps6598x module removal:
    
    WARNING: CPU: 2 PID: 333 at kernel/irq/devres.c:144 devm_free_irq+0x80/0x8c
    ...
    ...
    Call trace:
      devm_free_irq+0x80/0x8c
      tps6598x_remove+0x28/0x88 [tps6598x]
      i2c_device_remove+0x2c/0x9c
      device_remove+0x4c/0x80
      device_release_driver_internal+0x1cc/0x228
      driver_detach+0x50/0x98
      bus_remove_driver+0x6c/0xbc
      driver_unregister+0x30/0x60
      i2c_del_driver+0x54/0x64
      tps6598x_i2c_driver_exit+0x18/0xc3c [tps6598x]
      __arm64_sys_delete_module+0x184/0x264
      invoke_syscall+0x48/0x110
      el0_svc_common.constprop.0+0xc8/0xe8
      do_el0_svc+0x20/0x2c
      el0_svc+0x28/0x98
      el0t_64_sync_handler+0x13c/0x158
      el0t_64_sync+0x190/0x194
    
    Signed-off-by: Wadim Egorov <[email protected]>
    Reviewed-by: Heikki Krogerus <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

usb: xhci: Fix problem with xhci resume from suspend [+ + +]

Author: Jose Alberto Reguero <[email protected]>
Date:   Thu Sep 19 20:42:02 2024 +0200

    usb: xhci: Fix problem with xhci resume from suspend
    
    commit d44238d8254a36249d576c96473269dbe500f5e4 upstream.
    
    I have an ASUS PN51 S mini PC that has two xhci devices, one from AMD
    and the other from ASMEDIA. The one from ASMEDIA has problems when
    resuming from suspend, and stays broken until the power cord is
    unplugged. With the kernel parameter xhci-hcd.quirks=128 it works fine,
    so I made a patch to reset only the ASMEDIA xhci.
    
    Signed-off-by: Jose Alberto Reguero <[email protected]>
    Cc: stable <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

virtio_console: fix misc probe bugs [+ + +]

Author: Michael S. Tsirkin <[email protected]>
Date:   Mon Sep 16 14:16:44 2024 -0400

    virtio_console: fix misc probe bugs
    
    [ Upstream commit b9efbe2b8f0177fa97bfab290d60858900aa196b ]
    
    This fixes the following issue discovered by code review:
    
    after vqs have been created, a buggy device can send an interrupt.
    
    A control vq callback will then try to schedule control_work which has
    not been initialized yet. Similarly for config interrupt.  Further, in
    and out vq callbacks invoke find_port_by_vq which attempts to take
    ports_lock which also has not been initialized.
    
    To fix, init all locks and work before creating vqs.
    
    Message-ID: <ad982e975a6160ad110c623c016041311ca15b4f.1726511547.git.mst@redhat.com>
    Fixes: 17634ba25544 ("virtio: console: Add a new MULTIPORT feature, support for generic ports")
    Signed-off-by: Michael S. Tsirkin <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

virtio_pmem: Check device status before requesting flush [+ + +]

Author: Philip Chen <[email protected]>
Date:   Mon Aug 26 21:53:13 2024 +0000

    virtio_pmem: Check device status before requesting flush
    
    [ Upstream commit e25fbcd97cf52c3c9824d44b5c56c19673c3dd50 ]
    
    If a pmem device is in a bad status, the driver side could wait for
    host ack forever in virtio_pmem_flush(), causing the system to hang.
    
    So add a status check in the beginning of virtio_pmem_flush() to return
    early if the device is not activated.
    
    Signed-off-by: Philip Chen <[email protected]>
    Message-Id: <[email protected]>
    Signed-off-by: Michael S. Tsirkin <[email protected]>
    Acked-by: Pankaj Gupta <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

vxlan: Handle error of rtnl_register_module(). [+ + +]

Author: Kuniyuki Iwashima <[email protected]>
Date:   Tue Oct 8 11:47:33 2024 -0700

    vxlan: Handle error of rtnl_register_module().
    
    [ Upstream commit 78b7b991838a4a6baeaad934addc4db2c5917eb8 ]
    
    Since introduced, vxlan_vnifilter_init() has been ignoring the
    returned value of rtnl_register_module(), which could fail silently.
    
    Handling the error allows users to view a module as an all-or-nothing
    thing in terms of the rtnetlink functionality.  This prevents syzkaller
    from reporting spurious errors from its tests, where OOM often occurs
    and module is automatically loaded.
    
    Let's handle the errors by rtnl_register_many().
    
    Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device")
    Signed-off-by: Kuniyuki Iwashima <[email protected]>
    Reviewed-by: Nikolay Aleksandrov <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

wifi: mac80211: Avoid address calculations via out of bounds array indexing [+ + +]

Author: Kenton Groombridge <[email protected]>
Date:   Thu Oct 10 14:50:09 2024 +0800

    wifi: mac80211: Avoid address calculations via out of bounds array indexing
    
    [ Upstream commit 2663d0462eb32ae7c9b035300ab6b1523886c718 ]
    
    req->n_channels must be set before req->channels[] can be used.
    
    This patch fixes one of the issues encountered in [1].
    
    [   83.964255] UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:364:4
    [   83.964258] index 0 is out of range for type 'struct ieee80211_channel *[]'
    [...]
    [   83.964264] Call Trace:
    [   83.964267]  <TASK>
    [   83.964269]  dump_stack_lvl+0x3f/0xc0
    [   83.964274]  __ubsan_handle_out_of_bounds+0xec/0x110
    [   83.964278]  ieee80211_prep_hw_scan+0x2db/0x4b0
    [   83.964281]  __ieee80211_start_scan+0x601/0x990
    [   83.964291]  nl80211_trigger_scan+0x874/0x980
    [   83.964295]  genl_family_rcv_msg_doit+0xe8/0x160
    [   83.964298]  genl_rcv_msg+0x240/0x270
    [...]
    
    [1] https://bugzilla.kernel.org/show_bug.cgi?id=218810
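    
    The ordering requirement can be illustrated with a small user-space
    sketch. The types and build_request() below are hypothetical; the
    point is only that the element count is stored before any element of
    the flexible array is touched, so a bounds checker that derives the
    array size from the count sees a valid range.
    
    ```c
    #include <stdlib.h>

    struct channel {
        int center_freq;
    };

    /* Hypothetical model of a request with a counted flexible array. */
    struct scan_request_model {
        unsigned int n_channels;    /* element count for channels[] */
        struct channel *channels[]; /* sized by n_channels */
    };

    static struct scan_request_model *build_request(struct channel **src,
                                                    unsigned int n)
    {
        struct scan_request_model *req;
        unsigned int i;

        req = calloc(1, sizeof(*req) + n * sizeof(req->channels[0]));
        if (!req)
            return NULL;

        /* Publish the count first, then fill the array. */
        req->n_channels = n;
        for (i = 0; i < n; i++)
            req->channels[i] = src[i];

        return req;
    }
    ```
    
    Writing channels[0] while n_channels is still 0 is exactly the access
    that the UBSAN report above flags as index 0 being out of range.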
    
    Co-authored-by: Kees Cook <[email protected]>
    Signed-off-by: Kees Cook <[email protected]>
    Signed-off-by: Kenton Groombridge <[email protected]>
    Link: https://msgid.link/[email protected]
    Signed-off-by: Johannes Berg <[email protected]>
    [Xiangyu: Modified to apply on 6.1.y and 6.6.y]
    Signed-off-by: Xiangyu Chen <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

x86/amd_nb: Add new PCI IDs for AMD family 0x1a [+ + +]

Author: Shyam Sundar S K <[email protected]>
Date:   Fri May 10 16:48:28 2024 +0530

    x86/amd_nb: Add new PCI IDs for AMD family 0x1a
    
    [ Upstream commit 0e640f0a47d8426eab1fb9c03f0af898dfe810b8 ]
    
    Add the new PCI Device IDs to the MISC IDs list to support new
    generation of AMD 1Ah family 70h Models of processors.
    
      [ bp: Massage commit message. ]
    
    Signed-off-by: Shyam Sundar S K <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Stable-dep-of: 59c34008d3bd ("x86/amd_nb: Add new PCI IDs for AMD family 1Ah model 60h")
    Signed-off-by: Sasha Levin <[email protected]>

x86/amd_nb: Add new PCI IDs for AMD family 1Ah model 60h [+ + +]

Author: Shyam Sundar S K <[email protected]>
Date:   Mon Jul 22 14:58:01 2024 +0530

    x86/amd_nb: Add new PCI IDs for AMD family 1Ah model 60h
    
    [ Upstream commit 59c34008d3bdeef4c8ebc0ed2426109b474334d4 ]
    
    Add new PCI device IDs into the root IDs and miscellaneous IDs lists to
    provide support for the latest generation of AMD 1Ah family 60h processor
    models.
    
    Signed-off-by: Shyam Sundar S K <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Yazen Ghannam <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Sasha Levin <[email protected]>

zram: don't free statically defined names [+ + +]

Author: Andrey Skvortsov <[email protected]>
Date:   Wed Oct 9 13:51:40 2024 +0900

    zram: don't free statically defined names
    
    [ Upstream commit 486fd58af7ac1098b68370b1d4d9f94a2a1c7124 ]
    
    When CONFIG_ZRAM_MULTI_COMP isn't set ZRAM_SECONDARY_COMP can hold
    default_compressor, because it's the same offset as ZRAM_PRIMARY_COMP, so
    we need to make sure that we don't attempt to kfree() the statically
    defined compressor name.
    
    This is detected by KASAN.
    
    ==================================================================
      Call trace:
       kfree+0x60/0x3a0
       zram_destroy_comps+0x98/0x198 [zram]
       zram_reset_device+0x22c/0x4a8 [zram]
       reset_store+0x1bc/0x2d8 [zram]
       dev_attr_store+0x44/0x80
       sysfs_kf_write+0xfc/0x188
       kernfs_fop_write_iter+0x28c/0x428
       vfs_write+0x4dc/0x9b8
       ksys_write+0x100/0x1f8
       __arm64_sys_write+0x74/0xb8
       invoke_syscall+0xd8/0x260
       el0_svc_common.constprop.0+0xb4/0x240
       do_el0_svc+0x48/0x68
       el0_svc+0x40/0xc8
       el0t_64_sync_handler+0x120/0x130
       el0t_64_sync+0x190/0x198
    ==================================================================
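    
    The guard amounts to a pointer comparison before freeing. The sketch
    below is a user-space model under assumptions: the struct, the slot
    count and the compressor name string are hypothetical, and free()
    stands in for kfree().
    
    ```c
    #include <stdlib.h>

    /* Statically allocated default name; the value is illustrative. */
    static const char *default_compressor = "lzo-rle";

    #define NUM_COMPS 2 /* hypothetical number of algorithm slots */

    struct zram_model {
        const char *comp_algs[NUM_COMPS];
    };

    static void destroy_comp_names(struct zram_model *z)
    {
        int i;

        for (i = 0; i < NUM_COMPS; i++) {
            /* Only heap copies may be freed, never the static name. */
            if (z->comp_algs[i] != default_compressor)
                free((void *)z->comp_algs[i]);
            z->comp_algs[i] = NULL;
        }
    }
    ```
    
    Slots that are still NULL are harmless here, because free(NULL) is a
    no-op, just as kfree(NULL) is.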
    
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 684826f8271a ("zram: free secondary algorithms names")
    Signed-off-by: Andrey Skvortsov <[email protected]>
    Reviewed-by: Sergey Senozhatsky <[email protected]>
    Reported-by: Venkat Rao Bagalkote <[email protected]>
    Closes: https://lore.kernel.org/lkml/[email protected]/
    Tested-by: Venkat Rao Bagalkote <[email protected]>
    Cc: Christophe JAILLET <[email protected]>
    Cc: Jens Axboe <[email protected]>
    Cc: Minchan Kim <[email protected]>
    Cc: Sergey Senozhatsky <[email protected]>
    Cc: Venkat Rao Bagalkote <[email protected]>
    Cc: Chris Li <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>

zram: free secondary algorithms names [+ + +]

Author: Sergey Senozhatsky <[email protected]>
Date:   Wed Oct 9 13:51:39 2024 +0900

    zram: free secondary algorithms names
    
    [ Upstream commit 684826f8271ad97580b138b9ffd462005e470b99 ]
    
    We need to kfree() secondary algorithm names when resetting a zram
    device that had multi-streams, otherwise we leak memory.
    
    [[email protected]: kfree(NULL) is legal]
      Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 001d92735701 ("zram: add recompression algorithm sysfs knob")
    Signed-off-by: Sergey Senozhatsky <[email protected]>
    Cc: Minchan Kim <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>
    Signed-off-by: Sasha Levin <[email protected]>