Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Very large PR: 1k changed files, 70k changed lines #3

Open
wants to merge 1,149 commits into
base: branch-v6.3-rc1
Choose a base branch
from

Conversation

fahhem
Copy link

@fahhem fahhem commented Apr 1, 2023

This change is Reviewable

torvalds and others added 30 commits March 19, 2023 09:38
…ux/kernel/git/tytso/ext4

Pull ext4 fix from Ted Ts'o:
 "Fix a double unlock bug on an error path in ext4, found by smatch and
  syzkaller"

* tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix possible double unlock when moving a directory
…inux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
 "There's a little bit more 'movement' in there for my taste but it
  needs to happen and should make the code better after it.

   - Check cmdline_find_option()'s return value before further
     processing

   - Clear temporary storage in the resctrl code to prevent access to an
     unexistent MSR

   - Add a simple throttling mechanism to protect the hypervisor from
     potentially malicious SEV guests issuing requests in rapid
     succession.

     In order to not jeopardize the sanity of everyone involved in
     maintaining this code, the request issuing side has received a
     cleanup, split in more or less trivial, small and digestible
     pieces. Otherwise, the code was threatening to become an
     unmaintainable mess.

     Therefore, that cleanup is marked indirectly also for stable so
     that there's no differences between the upstream code and the
     stable variant when it comes down to backporting more there"

* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix use of uninitialized buffer in sme_enable()
  x86/resctrl: Clear staged_config[] before and after it is used
  virt/coco/sev-guest: Add throttling awareness
  virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
  virt/coco/sev-guest: Do some code style cleanups
  virt/coco/sev-guest: Carve out the request issuing logic into a helper
  virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
  virt/coco/sev-guest: Simplify extended guest request handling
  virt/coco/sev-guest: Check SEV_SNP attribute at probe time
…linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Check whether sibling events have been deactivated before adding them
   to groups

 - Update the proper event time tracking variable depending on the event
   type

 - Fix a memory overwrite issue due to using the wrong function argument
   when outputting perf events

* tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix check before add_event_to_groups() in perf_group_detach()
  perf: fix perf_event_context->time
  perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output
Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag
want us to perform a non-blocking scan of the AGs for free space.  There
are no ordering constraints for non-blocking AGF lock acquisition, so
the scan can freely start over at AG 0 even when minimum_agno > 0.

This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent
pointer patchset applied and the realtime volume enabled.  I observed
the following sequence as part of an xfs_dir_createname call:

0. Fragment the free space, then allocate nearly all the free space in
   all AGs except AG 0.

1. Create a directory in AG 2 and let it grow for a while.

2. Try to allocate 2 blocks to expand the dirent part of a directory.
   The space will be allocated out of AG 0, but the allocation will not
   be contiguous.  This (I think) activates the LOWMODE allocator.

3. The bmapi call decides to convert from extents to bmbt format and
   tries to allocate 1 block.  This allocation request calls
   xfs_alloc_vextent_start_ag with the inode number, which starts the
   scan at AG 2.  We ignore AG 0 (with all its free space) and instead
   scrape AG 2 and 3 for more space.  We find one block, but this now
   kicks t_highest_agno to 3.

4. The createname call decides it needs to split the dabtree.  It tries
   to allocate even more space with xfs_alloc_vextent_start_ag, but now
   we're constrained to AG 3, and we don't find the space.  The
   createname returns ENOSPC and the filesystem shuts down.

This change fixes the problem by making the trylock scan wrap around to
AG 0 if it doesn't like the AGs that it finds.  Since the current
transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and
we take space from the AG that has plenty.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
There are now five separate space allocator interfaces exposed to the
rest of XFS for five different strategies to find space.  Add
tracepoints for each of them so that I can tell from a trace dump
exactly which ones got called and what happened underneath them.  Add a
sixth so it's more obvious if an allocation actually happened.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in
ext4 against generic/454.  The cause of this test failure was the
unfortunate combination of setting an xattr name containing UTF8 encoded
emoji, an xattr hash function that accepted a char pointer with no
explicit signedness, signed type extension of those chars to an int, and
the 6.2 build tools maintainers deciding to mandate -funsigned-char
across the board.  As a result, the ondisk extended attribute structure
written out by 6.1 and 6.2 were not the same.

This discrepancy, in fact, had been noticeable if a filesystem with such
an xattr were moved between any two architectures that don't employ the
same signedness of a raw "char" declaration.  The only reason anyone
noticed is that x86 gcc defaults to signed, and no such -funsigned-char
update was made to e2fsprogs, so e2fsck immediately started reporting
data corruption.

After a day and a half of discussing how to handle this use case (xattrs
with bit 7 set anywhere in the name) without breaking existing users,
Linus merged his own patch and didn't tell the maintainer.  None of the
ext4 developers realized this until AUTOSEL announced that the commit
had been backported to stable.

In the end, this problem could have been detected much earlier if there
had been any useful tests of hash function(s) in use inside ext4 to make
sure that they always produce the same outputs given the same inputs.

The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's
vulnerable to this problem.  However, let's avoid all this drama by
adding our own self test to check that the da hash produces the same
outputs for a static pile of inputs on various platforms.  This enables
us to fix any breakage that may result in a controlled fashion.  The
buffer and test data are identical to the patches submitted to xfsprogs.

Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/
Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>
…inux/kernel/git/tip/tip

Pull RAS fix from Borislav Petkov:

 - Flush out logged errors immediately after MCA banks configuration
   changes over sysfs have been done instead of waiting until something
   else triggers the workqueue later - another error or the polling
   interval cycle is reached

* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Make sure logged MCEs are processed after sysfs update
Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
In commit f689054 ("percpu_counter: add percpu_counter_sum_all
interface") a race condition between a cpu dying and
percpu_counter_sum() iterating online CPUs was identified. The
solution was to iterate all possible CPUs for summation via
percpu_counter_sum_all().

We recently had a percpu_counter_sum() call in XFS trip over this
same race condition and it fired a debug assert because the
filesystem was unmounting and the counter *should* be zero just
before we destroy it. That was reported here:

https://lore.kernel.org/linux-kernel/[email protected]/

likely as a result of running generic/648 which exercises
filesystems in the presence of CPU online/offline events.

The solution to use percpu_counter_sum_all() is an awful one. We
use percpu counters and percpu_counter_sum() for accurate and
reliable threshold detection for space management, so a summation
race condition during these operations can result in overcommit of
available space and that may result in filesystem shutdowns.

As percpu_counter_sum_all() iterates all possible CPUs rather than
just those online or even those present, the mask can include CPUs
that aren't even installed in the machine, or in the case of
machines that can hot-plug CPU capable nodes, even have physical
sockets present in the machine.

Fundamentally, this race condition is caused by the CPU being
offlined being removed from the cpu_online_mask before the notifier
that cleans up per-cpu state is run. Hence percpu_counter_sum() will
not sum the count for a cpu currently being taken offline,
regardless of whether the notifier has run or not. This is
the root cause of the bug.

The percpu counter notifier iterates all the registered counters,
locks the counter and moves the percpu count to the global sum.
This is serialised against other operations that move the percpu
counter to the global sum as well as percpu_counter_sum() operations
that sum the percpu counts while holding the counter lock.

Hence the notifier is safe to run concurrently with sum operations,
and the only thing we actually need to care about is that
percpu_counter_sum() iterates dying CPUs. That's trivial to do,
and when there are no CPUs dying, it has no addition overhead except
for a cpumask_or() operation.

This change makes percpu_counter_sum() always do the right thing in
the presence of CPU hot unplug events and makes
percpu_counter_sum_all() unnecessary. This, in turn, means that
filesystems like XFS, ext4, and btrfs don't have to work out when
they should use percpu_counter_sum() vs percpu_counter_sum_all() in
their space accounting algorithms

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
This effectively reverts the change made in commit f689054
("percpu_counter: add percpu_counter_sum_all interface") as the
race condition percpu_counter_sum_all() was invented to avoid is
now handled directly in percpu_counter_sum() and nobody needs to
care about summing racing with cpu unplug anymore.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.

Remove it.

This effectively reverts the changes made in f689054
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>
…ernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are a few small char/misc/other driver subsystem patches to
  resolve reported problems for 6.3-rc3.

  Included in here are:

   - Interconnect driver fixes for reported problems

   - Memory driver fixes for reported problems

   - nvmem core fix

   - firmware driver fix for reported problem

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (23 commits)
  memory: tegra30-emc: fix interconnect registration race
  memory: tegra20-emc: fix interconnect registration race
  memory: tegra124-emc: fix interconnect registration race
  memory: tegra: fix interconnect registration race
  interconnect: exynos: drop redundant link destroy
  interconnect: exynos: fix registration race
  interconnect: exynos: fix node leak in probe PM QoS error path
  interconnect: qcom: msm8974: fix registration race
  interconnect: qcom: rpmh: fix registration race
  interconnect: qcom: rpmh: fix probe child-node error handling
  interconnect: qcom: rpm: fix registration race
  nvmem: core: return -ENOENT if nvmem cell is not found
  firmware: xilinx: don't make a sleepable memory allocation from an atomic context
  interconnect: qcom: rpm: fix probe child-node error handling
  interconnect: qcom: osm-l3: fix registration race
  interconnect: imx: fix registration race
  interconnect: fix provider registration API
  interconnect: fix icc_provider_del() error handling
  interconnect: fix mem leak when freeing nodes
  interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
  ...
…git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty and serial driver fixes for 6.3-rc3 to resolve
  some reported issues.

  They include:

   - 8250 driver Kconfig issue pointed out by you that showed up in -rc1

   - qcom-geni serial driver fixes

   - various 8250 driver fixes for reported problems

   - fsl_lpuart driver fixes

   - serdev fix for regression in -rc1

   - vt.c bugfix

  All have been in linux-next for over a week with no reported problems"

* tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: vt: protect KD_FONT_OP_GET_TALL from unbound access
  serial: qcom-geni: drop bogus uart_write_wakeup()
  serial: qcom-geni: fix mapping of empty DMA buffer
  serial: qcom-geni: fix DMA mapping leak on shutdown
  serial: qcom-geni: fix console shutdown hang
  serdev: Set fwnode for serdev devices
  tty: serial: fsl_lpuart: fix race on RX DMA shutdown
  serial: 8250_pci1xxxx: Disable SERIAL_8250_PCI1XXXX config by default
  serial: 8250_fsl: fix handle_irq locking
  serial: 8250_em: Fix UART port type
  serial: 8250: ASPEED_VUART: select REGMAP instead of depending on it
  tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
  Revert "tty: serial: fsl_lpuart: adjust SERIAL_FSL_LPUART_CONSOLE config dependency"
Since the commit 36e2c74 ("fs: don't allow splice read/write
without explicit ops") is applied to the kernel, splice() and
sendfile() calls on the trace file (/sys/kernel/debug/tracing
/trace) return EINVAL.

This patch restores these system calls by initializing splice_read
in file_operations of the trace file. This patch only enables such
functionalities for the read case.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: [email protected]
Fixes: 36e2c74 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
The comment refers to mm/slob.c which is being removed. It comes from
commit ed56829 ("ring_buffer: reset buffer page when freeing") and
according to Steven the borrowed code was a page mapcount and mapping
reset, which was later removed by commit e4c2ce8 ("ring_buffer:
allocate buffer page pointer"). Thus the comment is not accurate anyway,
remove it.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: Masami Hiramatsu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Reported-by: Mike Rapoport <[email protected]>
Suggested-by: Steven Rostedt (Google) <[email protected]>
Fixes: e4c2ce8 ("ring_buffer: allocate buffer page pointer")
Signed-off-by: Vlastimil Babka <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
There is a problem with the behavior of hwlat in a container,
resulting in incorrect output. A warning message is generated:
"cpumask changed while in round-robin mode, switching to mode none",
and the tracing_cpumask is ignored. This issue arises because
the kernel thread, hwlatd, is not a part of the container, and
the function sched_setaffinity is unable to locate it using its PID.
Additionally, the task_struct of hwlatd is already known.
Ultimately, the function set_cpus_allowed_ptr achieves
the same outcome as sched_setaffinity, but employs task_struct
instead of PID.

Test case:

  # cd /sys/kernel/tracing
  # echo 0 > tracing_on
  # echo round-robin > hwlat_detector/mode
  # echo hwlat > current_tracer
  # unshare --fork --pid bash -c 'echo 1 > tracing_on'
  # dmesg -c

Actual behavior:

[573502.809060] hwlat_detector: cpumask changed while in round-robin mode, switching to mode none

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: Masami Hiramatsu <[email protected]>
Fixes: 0330f7a ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Costa Shulyupin <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
…el/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix setting affinity of hwlat threads in containers

   Using sched_set_affinity() has unwanted side effects when being
   called within a container. Use set_cpus_allowed_ptr() instead

 - Fix per cpu thread management of the hwlat tracer:
    - Do not start per_cpu threads if one is already running for the CPU
    - When starting per_cpu threads, do not clear the kthread variable
      as it may already be set to running per cpu threads

 - Fix return value for test_gen_kprobe_cmd()

   On error the return value was overwritten by being set to the result
   of the call from kprobe_event_delete(), which would likely succeed,
   and thus have the function return success

 - Fix splice() reads from the trace file that was broken by commit
   36e2c74 ("fs: don't allow splice read/write without explicit
   ops")

 - Remove obsolete and confusing comment in ring_buffer.c

   The original design of the ring buffer used struct page flags for
   tricks to optimize, which was shortly removed due to them being
   tricks. But a comment for those tricks remained

 - Set local functions and variables to static

* tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
  ring-buffer: remove obsolete comment for free_buffer_page()
  tracing: Make splice_read available again
  ftrace: Set direct_ops storage-class-specifier to static
  trace/hwlat: Do not start per-cpu thread if it is already running
  trace/hwlat: Do not wipe the contents of per-cpu thread data
  tracing/osnoise: set several trace_osnoise.c variables storage-class-specifier to static
  tracing: Fix wrong return in kprobe_event_gen_test.c
`ring_interrupt_index` doesn't change the data for `ring` so mark it as
const. This is needed by the following patch that disables interrupt
auto clear for rings.

Cc: Sanju Mehta <[email protected]>
Cc: [email protected]
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts.  If two interrupts have
come in before one can be serviced then this will cause lost interrupts.

On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control tranfers such as reading the DROM via bit
banging.

Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.

Fixes: 7a1808f ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <[email protected]>
Cc: [email protected]
Tested-by: Anson Tsao <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
The commit 2357f2b ("drm/i915/mtl: Initial display workarounds")
extended the workaround Wa_16015201720 to MTL. However the registers
that the original WA implemented moved for MTL.

Implement the workaround with the correct register.

v3: Skip clock gating for pipe C, D DMC's and fix the title

Fixes: 2357f2b ("drm/i915/mtl: Initial display workarounds")
Cc: Matt Atwood <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Radhakrishna Sripada <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 0188be5)
Signed-off-by: Jani Nikula <[email protected]>
lock the fbdev obj before calling into
i915_vma_pin_iomap(). This helps to solve below :

<7>[   93.563308] i915 0000:00:02.0: [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
<4>[   93.581844] ------------[ cut here ]------------
<4>[   93.581855] WARNING: CPU: 12 PID: 625 at drivers/gpu/drm/i915/gem/i915_gem_pages.c:424 i915_gem_object_pin_map+0x152/0x1c0 [i915]

Fixes: f0b6b01 ("drm/i915: Add ww context to intel_dpt_pin, v2.")
Cc: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Radhakrishna Sripada <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 561b31a)
Signed-off-by: Jani Nikula <[email protected]>
intel_crtc_prepare_cleared_state() is unintentionally losing
the "inherited" flag. This will happen if intel_initial_commit()
is forced to go through the full modeset calculations for
whatever reason.

Afterwards the first real commit from userspace will not get
forced to the full modeset path, and thus eg. audio state may
not get recomputed properly. So if the monitor was already
enabled during boot audio will not work until userspace itself
does an explicit full modeset.

Cc: [email protected]
Tested-by: Lee Shawn C <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Uma Shankar <[email protected]>
(cherry picked from commit 2553bac)
Signed-off-by: Jani Nikula <[email protected]>
The Wa_14017073508 require to send Media Busy/Idle mailbox while
accessing Media tile. As of now it is getting handled while __gt_unpark,
__gt_park. But there are various corner cases where forcewakes are taken
without __gt_unpark i.e. without sending Busy Mailbox especially during
register reads. Forcewakes are taken without busy mailbox leads to
GPU HANG. So bringing mailbox calls under forcewake calls are no feasible
option as forcewake calls are atomic and mailbox calls are blocking.
The issue already fixed in B step so disabling MC6 on A step and
reverting previous commit which handles Wa_14017073508

Fixes: 8f70f1e ("drm/i915/mtl: Add Wa_14017073508 for SAMedia")
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 038a248)
Signed-off-by: Jani Nikula <[email protected]>
Error captures are tagged with an 'ecode'. This is a pseduo-unique magic
number that is meant to distinguish similar seeming bugs with
different underlying signatures. It is a combination of two ring state
registers. Unfortunately, the register state being used is only valid
in execlist mode. In GuC mode, the register state exists in a separate
list of arbitrary register address/value pairs rather than the named
entry structure. So, search through that list to find the two exciting
registers and copy them over to the structure's named members.

v2: if else if instead of if if (Alan)

Signed-off-by: John Harrison <[email protected]>
Reviewed-by: Alan Previn <[email protected]>
Fixes: a6f0f9c ("drm/i915/guc: Plumb GuC-capture into gpu_coredump")
Cc: Alan Previn <[email protected]>
Cc: Umesh Nerlige Ramappa <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Aravind Iddamsetty <[email protected]>
Cc: Michael Cheng <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Bruce Chang <[email protected]>
Cc: Daniele Ceraolo Spurio <[email protected]>
Cc: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 9724ecd)
Signed-off-by: Jani Nikula <[email protected]>
debug_active_activate() expected ref->count to be zero
which is not true anymore as __i915_active_activate() calls
debug_active_activate() after incrementing the count.

v2: No need to check for "ref->count == 1" as __i915_active_activate()
already make sure of that(Janusz).

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6733
Fixes: 04240e3 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
Cc: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: [email protected]
Cc: Janusz Krzysztofik <[email protected]>
Cc: <[email protected]> # v5.10+
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Janusz Krzysztofik <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit bfad380)
Signed-off-by: Jani Nikula <[email protected]>
Probe pseudo errors should be injected only in places where real errors
can be encountered, otherwise unwinding code can be broken.
Placing intel_uc_init_late before i915_inject_probe_error violated
this rule, resulting in following bug:
__intel_gt_disable:655 GEM_BUG_ON(intel_gt_pm_is_awake(gt))

Fixes: 481d458 ("drm/i915/guc: Add golden context to GuC ADS")
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit c4252a1)
Signed-off-by: Jani Nikula <[email protected]>
Use hex format so that it is easier to decode.

Fixes: fe59796 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
Signed-off-by: Vinay Belgaumkar <[email protected]>
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 5e008ba)
Signed-off-by: Jani Nikula <[email protected]>
When we change the M/N values seamlessly during a fastset we should
also update the vblank timestamping stuff to make sure the vblank
timestamp corrections/guesstimations come out exact.

Note that only crtc_clock and framedur_ns can actually end up
changing here during fastsets. Everything else we touch can
only change during full modesets.

Technically we should try to do this exactly at the start of
vblank, but that would require some kind of double buffering
scheme. Let's skip that for now and just update things right
after the commit has been submitted to the hardware. This
means the information will be properly up to date when the
vblank irq handler goes to work. Only if someone ends up
querying some vblanky stuff in between the commit and start
of vblank may we see a slight discrepancy.

Also this same problem really exists for the DRRS downclocking
stuff. But as that is supposed to be more or less transparent
to the user, and it only drops to low gear after a long delay
(1 sec currently) we probably don't have to worry about it.
Any time something is actively submitting updates DRRS will
remain in high gear and so the timestamping constants will
match the hardware state.

Reviewed-by: Jani Nikula <[email protected]>
Reviewed-by: Mitul Golani <[email protected]>
Fixes: e6f2992 ("drm/i915: Allow M/N change during fastset on bdw+")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8cb1f95)
Signed-off-by: Jani Nikula <[email protected]>
We collect the software statistics counters for RX bytes (reported to
/proc/net/dev and to ethtool -S $dev | grep 'rx_bytes: ") at a time when
skb->len has already been adjusted by the eth_type_trans() ->
skb_pull_inline(skb, ETH_HLEN) call to exclude the L2 header.

This means that when connecting 2 DSA interfaces back to back and
sending 1 packet with length 100, the sending interface will report
tx_bytes as incrementing by 100, and the receiving interface will report
rx_bytes as incrementing by 86.

Since accounting for that in scripts is quirky and is something that
would be DSA-specific behavior (requiring users to know that they are
running on a DSA interface in the first place), the proposal is that we
treat it as a bug and fix it.

This design bug has always existed in DSA, according to my analysis:
commit 91da11f ("net: Distributed Switch Architecture protocol
support") also updates skb->dev->stats.rx_bytes += skb->len after the
eth_type_trans() call. Technically, prior to Florian's commit
a86d8be ("net: dsa: Factor bottom tag receive functions"), each and
every vendor-specific tagging protocol driver open-coded the same bug,
until the buggy code was consolidated into something resembling what can
be seen now. So each and every driver should have its own Fixes: tag,
because of their different histories until the convergence point.
I'm not going to do that, for the sake of simplicity, but just blame the
oldest appearance of buggy code.

There are 2 ways to fix the problem. One is the obvious way, and the
other is how I ended up doing it. Obvious would have been to move
dev_sw_netstats_rx_add() one line above eth_type_trans(), and below
skb_push(skb, ETH_HLEN). But DSA processing is not as simple as that.
We count the bytes after removing everything DSA-related from the
packet, to emulate what the packet's length was, on the wire, when the
user port received it.

When eth_type_trans() executes, dsa_untag_bridge_pvid() has not run yet,
so in case the switch driver requests this behavior - commit
412a152 ("net: dsa: untag the bridge pvid from rx skbs") has the
details - the obvious variant of the fix wouldn't have worked, because
the positioning there would have also counted the not-yet-stripped VLAN
header length, something which is absent from the packet as seen on the
wire (there it may be untagged, whereas software will see it as
PVID-tagged).

Fixes: f613ed6 ("net: dsa: Add support for 64-bit statistics")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
pcacjr and others added 25 commits March 24, 2023 14:37
Get rid of any prefix paths in @path before lookup_positive_unlocked()
as it will call ->lookup() which already adds those prefix paths
through build_path_from_dentry().

This has caused a performance regression when mounting shares with a
prefix path where readdir(2) would end up retrying several times to
open bad directory names that contained duplicate prefix paths.

Fix this by skipping any prefix paths in @path before calling
lookup_positive_unlocked().

Fixes: e4029e0 ("cifs: find and use the dentry for cached non-root directories also")
Cc: [email protected] # 6.1+
Signed-off-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
If user does forced unmount ("umount -f") while files are still open
on the share (as was seen in a Kubernetes example running on SMB3.1.1
mount) then we were marking the share as "TID_EXITING" in umount_begin()
which caused all subsequent operations (except write) to fail ... but
unfortunately when umount_begin() is called we do not know yet that
there are open files or active references on the share that would prevent
unmount from succeeding.  Kubernetes had example when they were doing
umount -f when files were open which caused the share to become
unusable until the files were closed (and the umount retried).

Fix this so that TID_EXITING is not set until we are about to send
the tree disconnect (not at the beginning of forced umounts in
umount_begin) so that if "umount -f" fails (due to open files or
references) the mount is still usable.

Cc: [email protected]
Reviewed-by: Shyam Prasad N <[email protected]>
Reviewed-by: Paulo Alcantara (SUSE) <[email protected]>
Signed-off-by: Steve French <[email protected]>
At some point in between sending this patch to the list and merging it
into for-next, the tracepoints got all mixed up because I've
over-reliant on automated tools not sucking.  The end result is that the
tracepoints are all wrong, so fix them.

Signed-off-by: Darrick J. Wong <[email protected]>
…/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These add new ACPI IRQ override and backlight detection quirks.

  Specifics:

   - Add backlight=native DMI quirk for Acer Aspire 3830TG to the ACPI
     backlight driver (Hans de Goede)

   - Add an ACPI IRQ override quirk for Medion S17413 (Aymeric Wibo)"

* tag 'acpi-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: resource: Add Medion S17413 to IRQ override quirk
  ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG
…nel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These address two recent regressions related to thermal control.

  Specifics:

   - Restore the thermal core behavior regarding zero-temperature trip
     points to avoid a driver regression (Ido Schimmel)

   - Fix a recent regression in the ACPI processor driver preventing it
     from changing the number of CPU cooling device states exposed via
     sysfs after the given CPU cooling device has been registered
     (Rafael Wysocki)"

* tag 'thermal-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: core: Restore behavior regarding invalid trip points
  ACPI: processor: thermal: Update CPU cooling devices on cpufreq policy changes
  thermal: core: Introduce thermal_cooling_device_update()
  thermal: core: Introduce thermal_cooling_device_present()
  ACPI: processor: Reorder acpi_processor_driver_init()
Pull io_uring fixes from Jens Axboe:

 - Fix an issue with repeated -ECONNREFUSED on a socket (me)

 - Fix a NULL pointer deference due to a stale lookup cache for
   allocating direct descriptors (Savino)

* tag 'io_uring-6.3-2023-03-24' of git://git.kernel.dk/linux:
  io_uring/rsrc: fix null-ptr-deref in io_file_bitmap_get()
  io_uring/net: avoid sending -ECONNABORTED on repeated connection requests
Pull block fixes from Jens Axboe:

 - NVMe pull request via Christoph:
     - Send Identify with CNS 06h only to I/O controllers (Martin
       George)
     - Fix nvme_tcp_term_pdu to match spec (Caleb Sander)

 - Pass in issue_flags for uring_cmd, so the end_io handlers don't need
   to assume what the right context is (me)

 - Fix for ublk, marking it as LIVE before adding it to avoid races on
   the initial IO (Ming)

* tag 'block-6.3-2023-03-24' of git://git.kernel.dk/linux:
  nvme-tcp: fix nvme_tcp_term_pdu to match spec
  nvme: send Identify with CNS 06h only to I/O controllers
  block/io_uring: pass in issue_flags for uring_cmd task_work handling
  block: ublk_drv: mark device as LIVE before adding disk
…rnel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - Fix DM thin to work as a swap device by using 'limit_swap_bios' DM
   target flag (initially added to allow swap to dm-crypt) to throttle
   the amount of outstanding swap bios.

 - Fix DM crypt soft lockup warnings by calling cond_resched() from the
   cpu intensive loop in dmcrypt_write().

 - Fix DM crypt to not access an uninitialized tasklet. This fix allows
   for consistent handling of IO completion, by _not_ needlessly punting
   to a workqueue when tasklets are not needed.

 - Fix DM core's alloc_dev() initialization for DM stats to check for
   and propagate alloc_percpu() failure.

* tag 'for-6.3/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm stats: check for and propagate alloc_percpu failure
  dm crypt: avoid accessing uninitialized tasklet
  dm crypt: add cond_resched() to dmcrypt_write()
  dm thin: fix deadlock when swapping to thin device
…/drm

Pull drm fixes from Daniel Vetter:

 - usual pile of fixes for amdgpu & i915

 - probe error handling fixes for meson, lt8912b bridge

 - the host1x patch from Arnd

 - panel-orientation fix for Lenovo Book X90F

* tag 'drm-fixes-2023-03-24' of git://anongit.freedesktop.org/drm/drm: (23 commits)
  gpu: host1x: fix uninitialized variable use
  drm/amd/display: Set dcn32 caps.seamless_odm
  drm/amd/display: fix wrong index used in dccg32_set_dpstreamclk
  drm/amdgpu/nv: Apply ASPM quirk on Intel ADL + AMD Navi
  drm/amd/display: remove outdated 8bpc comments
  drm/amdgpu/gfx: set cg flags to enter/exit safe mode
  drm/amdgpu: Force signal hw_fences that are embedded in non-sched jobs
  drm/amdgpu: add mes resume when do gfx post soft reset
  drm/amdgpu: skip ASIC reset for APUs when go to S4
  drm/amdgpu: reposition the gpu reset checking for reuse
  drm/bridge: lt8912b: return EPROBE_DEFER if bridge is not found
  drm/meson: fix missing component unbind on bind errors
  drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F
  Revert "drm/i915/hwmon: Enable PL1 power limit"
  drm/i915: Update vblank timestamping stuff on seamless M/N change
  drm/i915: Fix format for perf_limit_reasons
  drm/i915/gt: perform uc late init after probe error injection
  drm/i915/active: Fix missing debug object activation
  drm/i915/guc: Fix missing ecodes
  drm/i915/mtl: Disable MC6 for MTL A step
  ...
…git/sre/linux-power-supply

Pull power supply fixes from Sebastian Reichel:

 - rk817: Fix compiler warning

 - cros_usbpd-charger: Fix excessive error printing

 - axp288_fuel_gauge: handle platform_get_irq error

 - bq24190 and da9150: Fix race condition in remove path

* tag 'for-v6.3-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: da9150: Fix use after free bug in da9150_charger_remove due to race condition
  power: supply: bq24190: Fix use after free bug in bq24190_remove due to race condition
  power: supply: axp288_fuel_gauge: Added check for negative values
  power: supply: cros_usbpd: reclassify "default case!" as debug
  power: supply: rk817: Fix unsigned comparison with less than zero
…nel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "As usual, most of the bug fixes address issues in the devicetree
  files, and out of these, most are for the Qualcomm and NXP platforms,
  including:

   - A missing 'reserved-memory' property on LG G Watch R that is needed
     to prevent clashing with firmware

   - Annotations for cache coherency on multiple machines

   - Corrections for pinctrl, regulator, clock, iommu and power domain
     properties for i.MX and Qualcomm to correctly reflect the hardware
     settings

   - Firmware file names on multiple machines SA8540P Ride board

   - An incompatible change to the qcom vadc driver requires adding
     individual labels

   - Fix EQoS PHY reset GPIO by dropping the deprecated/wrong property
     and switch to the new bindings.

   - A fix for PCI bus address translation Tegra194 and Tegra234.

  There are also a couple of device driver fixes, addressing:

   - A race condition in the amdtee driver

   - A performance regression in the Qualcomm 'llcc' driver

   - An unitialized variable use NXP i.MX 'weim' driver

   - Error handling issues in Qualcomm 'rmtfs', and 'scm' drivers and
     the Arm scmi firmware driver"

* tag 'arm-fixes-6.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (48 commits)
  arm64: dts: qcom: sc8280xp-x13s: mark bob regulator as always-on
  arm64: dts: qcom: sc8280xp-x13s: mark s12b regulator as always-on
  arm64: dts: qcom: sc8280xp-x13s: mark s10b regulator as always-on
  arm64: dts: qcom: sc8280xp-x13s: mark s11b regulator as always-on
  arm64: dts: imx93: add missing #address-cells and #size-cells to i2c nodes
  bus: imx-weim: fix branch condition evaluates to a garbage value
  arm64: dts: imx8mn: specify #sound-dai-cells for SAI nodes
  ARM: dts: imx6sl: tolino-shine2hd: fix usbotg1 pinctrl
  ARM: dts: imx6sll: e60k02: fix usbotg1 pinctrl
  ARM: dts: imx6sll: e70k02: fix usbotg1 pinctrl
  arm64: dts: imx93: Fix eqos properties
  arm64: dts: imx8mp: Fix LCDIF2 node clock order
  arm64: dts: imx8mm-nitrogen-r2: fix WM8960 clock name
  arm64: dts: imx8dxl-evk: Fix eqos phy reset gpio
  firmware: qcom: scm: fix bogus irq error at probe
  arm64: dts: qcom: sm8550: Mark UFS controller as cache coherent
  arm64: dts: qcom: sa8540p-ride: correct name of remoteproc_nsp0 firmware
  arm64: dts: qcom: sm8450: Mark UFS controller as cache coherent
  arm64: dts: qcom: sm8350: Mark UFS controller as cache coherent
  arm64: dts: qcom: sm8550: fix LPASS pinctrl slew base address
  ...
Pull ksmbd server fixes from Steve French:

 - return less confusing messages on unsupported dialects
   (STATUS_NOT_SUPPORTED instead of I/O error)

 - fix for overly frequent inactive session termination

 - fix refcount leak

 - fix bounds check problems found by static checkers

 - fix to advertise named stream support correctly

 - Fix AES256 signing bug when connected to from MacOS

* tag '6.3-rc3-ksmbd-smb3-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: return unsupported error on smb1 mount
  ksmbd: return STATUS_NOT_SUPPORTED on unsupported smb2.0 dialect
  ksmbd: don't terminate inactive sessions after a few seconds
  ksmbd: fix possible refcount leak in smb2_open()
  ksmbd: add low bound validation to FSCTL_QUERY_ALLOCATED_RANGES
  ksmbd: add low bound validation to FSCTL_SET_ZERO_DATA
  ksmbd: set FILE_NAMED_STREAMS attribute in FS_ATTRIBUTE_INFORMATION
  ksmbd: fix wrong signingkey creation when encryption is AES256
…rg/pub/scm/linux/kernel/git/akpm/mm

Pull misc fixes from Andrew Morton:
 "21 hotfixes, 8 of which are cc:stable. 11 are for MM, the remainder
  are for other subsystems"

* tag 'mm-hotfixes-stable-2023-03-24-17-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (21 commits)
  mm: mmap: remove newline at the end of the trace
  mailmap: add entries for Richard Leitner
  kcsan: avoid passing -g for test
  kfence: avoid passing -g for test
  mm: kfence: fix using kfence_metadata without initialization in show_object()
  lib: dhry: fix unstable smp_processor_id(_) usage
  mailmap: add entry for Enric Balletbo i Serra
  mailmap: map Sai Prakash Ranjan's old address to his current one
  mailmap: map Rajendra Nayak's old address to his current one
  Revert "kasan: drop skip_kasan_poison variable in free_pages_prepare"
  mailmap: add entry for Tobias Klauser
  kasan, powerpc: don't rename memintrinsics if compiler adds prefixes
  mm/ksm: fix race with VMA iteration and mm_struct teardown
  kselftest: vm: fix unused variable warning
  mm: fix error handling for map_deny_write_exec
  mm: deduplicate error handling for map_deny_write_exec
  checksyscalls: ignore fstat to silence build warning on LoongArch
  nilfs2: fix kernel-infoleak in nilfs_ioctl_wrap_copy()
  test_maple_tree: add more testing for mas_empty_area()
  maple_tree: fix mas_skip_node() end slot detection
  ...
…kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - it87: Fix voltage scaling for chips with 10.9mV ADCs

 - xgene: Fix ioremap and memremap leak

 - peci/cputemp: Fix miscalculated DTS temperature for SKX

 - hwmon core: fix potential sensor registration failure with thermal
   subsystem if of_node is missing

* tag 'hwmon-for-v6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon (it87): Fix voltage scaling for chips with 10.9mV  ADCs
  hwmon: (xgene) Fix ioremap and memremap leak
  hwmon: fix potential sensor registration fail if of_node is missing
  hwmon: (peci/cputemp) Fix miscalculated DTS for SKX
…s-linux

Pull xfs fixes from Darrick Wong:
 "This batch started with some debugging enhancements to the new
  allocator refactoring that we put in 6.3-rc1 to assist developers in
  rebasing their dev branches.

  As for more serious code changes -- there's a bug fix to make the
  lockless allocator scan the whole filesystem before resorting to the
  locking allocator. We're also adding a selftest for the venerable
  directory/xattr hash function to make sure that it produces consistent
  results so that we can address any fallout as soon as possible.

   - Add a few debugging assertions so that people (me) trying to port
     code to the new allocator functions don't mess up the caller
     requirements

   - Relax some overly cautious lock ordering enforcement in the new
     allocator code, which means that file allocations will locklessly
     scan for the best space they can get before backing off to the
     traditional lock-and-really-get-it behavior

   - Add tracepoints to make it easier to trace the xfs allocator
     behavior

   - Actually test the dir/xattr hash algorithm to make sure it produces
     consistent results across all the platforms XFS supports"

* tag 'xfs-6.3-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: test dir/attr hash when loading module
  xfs: add tracepoints for each of the externally visible allocators
  xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags
  xfs: try to idiot-proof the allocators
…s-linux

Pull xfs percpu counter fixes from Darrick Wong:
 "We discovered a filesystem summary counter corruption problem that was
  traced to cpu hot-remove racing with the call to percpu_counter_sum
  that sets the free block count in the superblock when writing it to
  disk. The root cause is that percpu_counter_sum doesn't cull from
  dying cpus and hence misses those counter values if the cpu shutdown
  hooks have not yet run to merge the values.

  I'm hoping this is a fairly painless fix to the problem, since the
  dying cpu mask should generally be empty. It's been in for-next for a
  week without any complaints from the bots.

   - Fix a race in the percpu counters summation code where the
     summation failed to add in the values for any CPUs that were dying
     but not yet dead. This fixes some minor discrepancies and incorrect
     assertions when running generic/650"

* tag 'xfs-6.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  pcpcntr: remove percpu_counter_sum_all()
  fork: remove use of percpu_counter_sum_all
  pcpcntrs: fix dying cpu summation race
  cpumask: introduce for_each_cpu_or
…s-linux

Pull yet more xfs bug fixes from Darrick Wong:
 "The first bugfix addresses a longstanding problem where we use the
  wrong file mapping cursors when trying to compute the speculative
  preallocation quantity. This has been causing sporadic crashes when
  alwayscow mode is engaged.

  The other two fixes correct minor problems in more recent changes.

   - Fix the new allocator tracepoints because git am mismerged the
     changes such that the trace_XXX got rebased to be in function YYY
     instead of XXX

   - Ensure that the perag AGFL_RESET state is consistent with whatever
     we've just read off the disk

   - Fix a bug where we used the wrong iext cursor during a write begin"

* tag 'xfs-6.3-fixes-7' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix mismerged tracepoints
  xfs: clear incore AGFL_RESET state if it's not needed
  xfs: pass the correct cursor to xfs_iomap_prealloc_size
…it/cel/linux

Pull nfsd fix from Chuck Lever:

 - Fix a crash when using NFS with krb5p

* tag 'nfsd-6.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  SUNRPC: Fix a crash in gss_krb5_checksum()
…cifs-2.6

Pull cifs client fixes from Steve French:
 "Twelve cifs/smb3 client fixes (most also for stable)

   - forced umount fix

   - fix for two perf regressions

   - reconnect fixes

   - small debugging improvements

   - multichannel fixes"

* tag 'smb3-client-fixes-6.3-rc3' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: fix unusable share after force unmount failure
  cifs: fix dentry lookups in directory handle cache
  smb3: lower default deferred close timeout to address perf regression
  cifs: fix missing unload_nls() in smb2_reconnect()
  cifs: avoid race conditions with parallel reconnects
  cifs: append path to open_enter trace event
  cifs: print session id while listing open files
  cifs: dump pending mids for all channels in DebugData
  cifs: empty interface list when server doesn't support query interfaces
  cifs: do not poll server interfaces too regularly
  cifs: lock chan_lock outside match_session
  cifs: check only tcon status on tcon related functions
…inux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Add a AMX ptrace self test

 - Prevent a false-positive warning when retrieving the (invalid)
   address of dynamic FPU features in their init state which are not
   saved in init_fpstate at all

 - Randomize per-CPU entry areas only when KASLR is enabled

* tag 'x86_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/x86/amx: Add a ptrace test
  x86/fpu/xstate: Prevent false-positive warning in __copy_xstate_uabi_buf()
  x86/mm: Do not shuffle CPU entry areas without KASLR
…linux/kernel/git/tip/tip

Pull core fixes from Borislav Petkov:

 - Do the delayed RCU wakeup for kthreads in the proper order so that
   former doesn't get ignored

 - A noinstr warning fix

* tag 'core_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up
  entry: Fix noinstr warning in __enter_from_user_mode()
…linux/kernel/git/tip/tip

Pull perf fix from Borislav Petkov:

 - Properly clear perf event status tracking in the AMD perf event
   overflow handler

* tag 'perf_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/amd/core: Always clear status for idx
…/linux/kernel/git/tip/tip

Pull scheduler fix from Borislav Petkov:

 - Fix a corner case where vruntime of a task is not being sanitized

* tag 'sched_urgent_for_v6.3_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Sanitize vruntime of entity being migrated
…git/gregkh/usb

Pull USB / Thunderbolt driver fixes from Greg KH:
 "Here are a small set of USB and Thunderbolt driver fixes for reported
  problems and a documentation update, for 6.3-rc4.

  Included in here are:

   - documentation update for uvc gadget driver

   - small thunderbolt driver fixes

   - cdns3 driver fixes

   - dwc3 driver fixes

   - dwc2 driver fixes

   - chipidea driver fixes

   - typec driver fixes

   - onboard_usb_hub device id updates

   - quirk updates

  All of these have been in linux-next with no reported problems"

* tag 'usb-6.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (30 commits)
  usb: dwc2: fix a race, don't power off/on phy for dual-role mode
  usb: dwc2: fix a devres leak in hw_enable upon suspend resume
  usb: chipidea: core: fix possible concurrent when switch role
  usb: chipdea: core: fix return -EINVAL if request role is the same with current role
  thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit
  thunderbolt: Disable interrupt auto clear for rings
  thunderbolt: Use const qualifier for `ring_interrupt_index`
  usb: gadget: Use correct endianness of the wLength field for WebUSB
  uas: Add US_FL_NO_REPORT_OPCODES for JMicron JMS583Gen 2
  usb: cdnsp: changes PCI Device ID to fix conflict with CNDS3 driver
  usb: cdns3: Fix issue with using incorrect PCI device function
  usb: cdnsp: Fixes issue with redundant Status Stage
  MAINTAINERS: make me a reviewer of USB/IP
  thunderbolt: Use scale field when allocating USB3 bandwidth
  thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
  thunderbolt: Call tb_check_quirks() after initializing adapters
  thunderbolt: Add missing UNSET_INBOUND_SBTX for retimer access
  thunderbolt: Fix memory leak in margining
  usb: dwc2: drd: fix inconsistent mode if role-switch-default-mode="host"
  docs: usb: Add documentation for the UVC Gadget
  ...
Copy link
Member

@pkaminski pkaminski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 0 of 1148 files reviewed, 1 unresolved discussion (waiting on @fahhem)


.gitignore line 81 at r1 (raw file):

#
/*.spec
/rpmbuild/

Random comment.

fahhem pushed a commit that referenced this pull request Apr 19, 2023
…kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 6.3, part #3

 - Ensure the guest PMU context is restored before the first KVM_RUN,
   fixing an issue where EL0 event counting is broken after vCPU
   save/restore

 - Actually initialize ID_AA64PFR0_EL1.{CSV2,CSV3} based on the
   sanitized, system-wide values for protected VMs
fahhem pushed a commit that referenced this pull request Apr 19, 2023
…ct_cfm

This attempts to fix the following trace:

======================================================
WARNING: possible circular locking dependency detected
6.3.0-rc2-g0b93eeba4454 #4703 Not tainted
------------------------------------------------------
kworker/u3:0/46 is trying to acquire lock:
ffff888001fd9130 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}, at:
sco_connect_cfm+0x118/0x4a0

but task is already holding lock:
ffffffff831e3340 (hci_cb_list_lock){+.+.}-{3:3}, at:
hci_sync_conn_complete_evt+0x1ad/0x3d0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (hci_cb_list_lock){+.+.}-{3:3}:
       __mutex_lock+0x13b/0xcc0
       hci_sync_conn_complete_evt+0x1ad/0x3d0
       hci_event_packet+0x55c/0x7c0
       hci_rx_work+0x34c/0xa00
       process_one_work+0x575/0x910
       worker_thread+0x89/0x6f0
       kthread+0x14e/0x180
       ret_from_fork+0x2b/0x50

-> #1 (&hdev->lock){+.+.}-{3:3}:
       __mutex_lock+0x13b/0xcc0
       sco_sock_connect+0xfc/0x630
       __sys_connect+0x197/0x1b0
       __x64_sys_connect+0x37/0x50
       do_syscall_64+0x42/0x90
       entry_SYSCALL_64_after_hwframe+0x70/0xda

-> #0 (sk_lock-AF_BLUETOOTH-BTPROTO_SCO){+.+.}-{0:0}:
       __lock_acquire+0x18cc/0x3740
       lock_acquire+0x151/0x3a0
       lock_sock_nested+0x32/0x80
       sco_connect_cfm+0x118/0x4a0
       hci_sync_conn_complete_evt+0x1e6/0x3d0
       hci_event_packet+0x55c/0x7c0
       hci_rx_work+0x34c/0xa00
       process_one_work+0x575/0x910
       worker_thread+0x89/0x6f0
       kthread+0x14e/0x180
       ret_from_fork+0x2b/0x50

other info that might help us debug this:

Chain exists of:
  sk_lock-AF_BLUETOOTH-BTPROTO_SCO --> &hdev->lock --> hci_cb_list_lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(hci_cb_list_lock);
                               lock(&hdev->lock);
                               lock(hci_cb_list_lock);
  lock(sk_lock-AF_BLUETOOTH-BTPROTO_SCO);

 *** DEADLOCK ***

4 locks held by kworker/u3:0/46:
 #0: ffff8880028d1130 ((wq_completion)hci0#2){+.+.}-{0:0}, at:
 process_one_work+0x4c0/0x910
 #1: ffff8880013dfde0 ((work_completion)(&hdev->rx_work)){+.+.}-{0:0},
 at: process_one_work+0x4c0/0x910
 #2: ffff8880025d8070 (&hdev->lock){+.+.}-{3:3}, at:
 hci_sync_conn_complete_evt+0xa6/0x3d0
 #3: ffffffffb79e3340 (hci_cb_list_lock){+.+.}-{3:3}, at:
 hci_sync_conn_complete_evt+0x1ad/0x3d0

Signed-off-by: Luiz Augusto von Dentz <[email protected]>
fahhem added a commit that referenced this pull request Apr 19, 2023
commit 76dcd734eca23168cb008912c0f69ff408905235
Author: Linus Torvalds <[email protected]>
Date:   Sun Dec 4 14:48:12 2022 -0800

    Linux 6.1-rc8

commit 0ba09b1733878afe838fe35c310715fda3d46428
Author: Linus Torvalds <[email protected]>
Date:   Sun Dec 4 12:51:59 2022 -0800

    Revert "mm: align larger anonymous mappings on THP boundaries"

    This reverts commit f35b5d7d676e59e401690b678cd3cfec5e785c23.

    It has been reported to cause huge performance regressions on some loads
    (will-it-scale.per_process_ops, but also building the kernel with
    clang).

    The commit did speed up gcc builds by a small amount, so it's not an
    unambiguous regression, but until the big regressions are understood,
    let's revert it.

    Reported-by: kernel test robot <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Reported-by: Nathan Chancellor <[email protected]>
    Link: https://lore.kernel.org/lkml/Y1DNQaoPWxE%[email protected]/
    Cc: Huang, Ying <[email protected]>
    Cc: Rik van Riel <[email protected]>
    Cc: Andrew Morton <[email protected]>
    Cc: Yang Shi <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 23393c6461422df5bf8084a086ada9a7e17dc2ba
Author: Jan Dabros <[email protected]>
Date:   Mon Nov 28 20:56:51 2022 +0100

    char: tpm: Protect tpm_pm_suspend with locks

    Currently tpm transactions are executed unconditionally in
    tpm_pm_suspend() function, which may lead to races with other tpm
    accessors in the system.

    Specifically, the hw_random tpm driver makes use of tpm_get_random(),
    and this function is called in a loop from a kthread, which means it's
    not frozen alongside userspace, and so can race with the work done
    during system suspend:

      tpm tpm0: tpm_transmit: tpm_recv: error -52
      tpm tpm0: invalid TPM_STS.x 0xff, dumping stack for forensics
      CPU: 0 PID: 1 Comm: init Not tainted 6.1.0-rc5+ #135
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-20220807_005459-localhost 04/01/2014
      Call Trace:
       tpm_tis_status.cold+0x19/0x20
       tpm_transmit+0x13b/0x390
       tpm_transmit_cmd+0x20/0x80
       tpm1_pm_suspend+0xa6/0x110
       tpm_pm_suspend+0x53/0x80
       __pnp_bus_suspend+0x35/0xe0
       __device_suspend+0x10f/0x350

    Fix this by calling tpm_try_get_ops(), which itself is a wrapper around
    tpm_chip_start(), but takes the appropriate mutex.

    Signed-off-by: Jan Dabros <[email protected]>
    Reported-by: Vlastimil Babka <[email protected]>
    Tested-by: Jason A. Donenfeld <[email protected]>
    Tested-by: Vlastimil Babka <[email protected]>
    Link: https://lore.kernel.org/all/[email protected]/
    Cc: [email protected]
    Fixes: e891db1a18bf ("tpm: turn on TPM on suspend for TPM 1.x")
    [Jason: reworked commit message, added metadata]
    Signed-off-by: Jason A. Donenfeld <[email protected]>
    Reviewed-by: Jarkko Sakkinen <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 0c3b5bcb484a659dd14466f92a073b57b2d3c1a5
Merge: eea8bebd5173 517e6a301f34
Author: Linus Torvalds <[email protected]>
Date:   Sun Dec 4 12:36:23 2022 -0800

    Merge tag 'perf_urgent_for_v6.1_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull perf fix from Borislav Petkov:

     - Fix a use-after-free case where the perf pending task callback would
       see an already freed event

    * tag 'perf_urgent_for_v6.1_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      perf: Fix perf_pending_task() UaF

commit eea8bebd51739cc7a3bb501032ee877b4aada553
Merge: ae6bb7171171 d9f15a9de44a
Author: Linus Torvalds <[email protected]>
Date:   Sun Dec 4 12:33:44 2022 -0800

    Merge tag 'timers_urgent_for_v6.1_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

    Pull timer fix from Borislav Petkov:

     - Revert a fix to RISC-V timers supposed to address an uncertainty
       whether clock events are received during S3 or not which locks up
       other RISC-V platforms. The issue will be fixed differently later.

    * tag 'timers_urgent_for_v6.1_rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
      Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"

commit ae6bb71711713e7b6b2d9ed2af7318dc8ae5cfb2
Merge: 50f36c5aa12c 2e7ec190a0e3
Author: Linus Torvalds <[email protected]>
Date:   Sun Dec 4 12:24:58 2022 -0800

    Merge tag 'powerpc-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

    Pull powerpc fixes from Michael Ellerman:

     - Fix oops in 32-bit BPF tail call tests

     - Add missing declaration for machine_check_early_boot()

    Thanks to Christophe Leroy and Naveen N. Rao.

    * tag 'powerpc-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
      powerpc/64s: Add missing declaration for machine_check_early_boot()
      powerpc/bpf/32: Fix Oops on tail call tests

commit 50f36c5aa12c8d0f2adca662e0c106212ea897a8
Merge: c2bf05db6c78 8c9a59939deb
Author: Linus Torvalds <[email protected]>
Date:   Sun Dec 4 12:18:37 2022 -0800

    Merge tag 'input-for-v6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

    Pull input fix from Dmitry Torokhov:

     - a fix for Raydium touchscreen driver to stop leaking memory when
       sending commands to the chip

    * tag 'input-for-v6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
      Input: raydium_ts_i2c - fix memory leak in raydium_i2c_send()

commit c2bf05db6c78f53ca5cd4b48f3b9b71f78d215f1
Merge: 6085bc95797c d36678f7905c
Author: Linus Torvalds <[email protected]>
Date:   Sat Dec 3 13:51:37 2022 -0800

    Merge tag 'i2c-for-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

    Pull i2c fixes from Wolfram Sang:
     "A power state fix in the core for ACPI devices, a regression fix
      regarding bus recovery for the cadence driver, a DMA handling fix for
      the imx driver, and two error path fixes (npcm7xx and qcom-geni)"

    * tag 'i2c-for-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
      i2c: imx: Only DMA messages with I2C_M_DMA_SAFE flag set
      i2c: qcom-geni: fix error return code in geni_i2c_gpi_xfer
      i2c: cadence: Fix regression with bus recovery
      i2c: Restore initial power state if probe fails
      i2c: npcm7xx: Fix error handling in npcm_i2c_init()

commit 6085bc95797caa55a68bc0f7dd73e8c33e91037f
Merge: 97ee9d1c1696 472faf72b33d
Author: Linus Torvalds <[email protected]>
Date:   Sat Dec 3 13:43:38 2022 -0800

    Merge tag 'dax-fixes-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

    Pull dax fixes from Dan Williams:
     "A few bug fixes around the handling of "Soft Reserved" memory and
      memory tiering information.

      Linux is starting to enounter more real world systems that deploy an
      ACPI HMAT to describe different performance classes of memory, as well
      the "special purpose" (Linux "Soft Reserved") designation from EFI.

      These fixes result from that testing.

      It has all appeared in -next for a while with no known issues.

       - Fix duplicate overlapping device-dax instances for HMAT described
         "Soft Reserved" Memory

       - Fix missing node targets in the sysfs representation of memory
         tiers

       - Remove a confusing variable initialization"

    * tag 'dax-fixes-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
      device-dax: Fix duplicate 'hmem' device registration
      ACPI: HMAT: Fix initiator registration for single-initiator systems
      ACPI: HMAT: remove unnecessary variable initialization

commit 97ee9d1c16963375eefdf964c429897d27e28956
Merge: 63050a5ca130 d0f411c0b9bd
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 16:27:15 2022 -0800

    Merge tag 'block-6.1-2022-12-02' of git://git.kernel.dk/linux

    Pull block fixes from Jens Axboe:
     "Just a small NVMe merge for this week, fixing protection of the name
      space list, and a missing clear of a reserved field when unused"

    * tag 'block-6.1-2022-12-02' of git://git.kernel.dk/linux:
      nvme: fix SRCU protection of nvme_ns_head list
      nvme-pci: clear the prp2 field when not used

commit 63050a5ca130e76af7199c9a3fba1d175f3a1102
Merge: 0e15c3c75a28 6989ea4881c8
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 16:22:17 2022 -0800

    Merge tag 'pinctrl-v6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl

    Pull pin control fixes from Linus Walleij:
     "Three driver fixes. The Intel fix looks like the most important.

       - Fix a potential divide by zero in pinctrl-singe (OMAP and
         HiSilicon)

       - Disable IRQs on startup in the Mediatek driver. This is a classic,
         we should be looking out for this more.

       - Save and restore pins in 'direct IRQ' mode in the Intel driver,
         this works around firmware bugs"

    * tag 'pinctrl-v6.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
      pinctrl: intel: Save and restore pins in "direct IRQ" mode
      pinctrl: meditatek: Startup with the IRQs disabled
      pinctrl: single: Fix potential division by zero

commit 0e15c3c75a28b10bac7b3ad7627fd6b458623283
Merge: 2df2adc3e69d 39cefc5f6cd2
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 16:04:53 2022 -0800

    Merge tag 'riscv-for-linus-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

    Pull RISC-V fixes from Palmer Dabbelt:

     - build fix for the NR_CPUS Kconfig SBI version dependency

     - fixes to early memory initialization, to fix page permissions in EFI
       and post-initmem-free

     - build fix for the VDSO, to avoid trying to profile the VDSO functions

     - fixes for kexec crash handling, to fix multi-core and interrupt
       related initialization inside the crash kernel

     - fix for a race condition when handling multiple concurrect kernel
       stack overflows

    * tag 'riscv-for-linus-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
      riscv: kexec: Fixup crash_smp_send_stop without multi cores
      riscv: kexec: Fixup irq controller broken in kexec crash path
      riscv: mm: Proper page permissions after initmem free
      riscv: vdso: fix section overlapping under some conditions
      riscv: fix race when vmap stack overflow
      riscv: Sync efi page table's kernel mappings before switching
      riscv: Fix NR_CPUS range conditions

commit 2df2adc3e69d32751eb534ed55c591854260c4a3
Merge: f66f62f83dba dd30dcfa7a74
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 15:58:07 2022 -0800

    Merge tag 'mmc-v6.1-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

    Pull MMC fixes from Ulf Hansson:
     "MMC core:
       - Fix ambiguous TRIM and DISCARD args
       - Fix removal of debugfs file for mmc_test

      MMC host:
       - mtk-sd: Add missing clk_disable_unprepare() in an error path
       - sdhci: Fix I/O voltage switch delay for UHS-I SD cards
       - sdhci-esdhc-imx: Fix CQHCI exit halt state check
       - sdhci-sprd: Fix voltage switch"

    * tag 'mmc-v6.1-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
      mmc: sdhci-sprd: Fix no reset data and command after voltage switch
      mmc: sdhci: Fix voltage switch delay
      mmc: mtk-sd: Fix missing clk_disable_unprepare in msdc_of_clock_parse()
      mmc: mmc_test: Fix removal of debugfs file
      mmc: sdhci-esdhc-imx: correct CQHCI exit halt state check
      mmc: core: Fix ambiguous TRIM and DISCARD arg

commit f66f62f83dba3ec4237c3dd7a95b776824ac91a0
Merge: 66065157420c 4bedbbd782eb
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 15:54:12 2022 -0800

    Merge tag 'iommu-fixes-v6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

    Pull iommu fixes from Joerg Roedel:
     "Intel VT-d fixes:

       - IO/TLB flush fix

       - Various pci_dev refcount fixes"

    * tag 'iommu-fixes-v6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
      iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init()
      iommu/vt-d: Fix PCI device refcount leak in has_external_pci()
      iommu/vt-d: Fix PCI device refcount leak in prq_event_thread()
      iommu/vt-d: Add a fix for devices need extra dtlb flush

commit 66065157420c5b9b3f078f43d313c153e1ff7f83
Author: Pawan Gupta <[email protected]>
Date:   Wed Nov 30 07:25:51 2022 -0800

    x86/bugs: Make sure MSR_SPEC_CTRL is updated properly upon resume from S3

    The "force" argument to write_spec_ctrl_current() is currently ambiguous
    as it does not guarantee the MSR write. This is due to the optimization
    that writes to the MSR happen only when the new value differs from the
    cached value.

    This is fine in most cases, but breaks for S3 resume when the cached MSR
    value gets out of sync with the hardware MSR value due to S3 resetting
    it.

    When x86_spec_ctrl_current is same as x86_spec_ctrl_base, the MSR write
    is skipped. Which results in SPEC_CTRL mitigations not getting restored.

    Move the MSR write from write_spec_ctrl_current() to a new function that
    unconditionally writes to the MSR. Update the callers accordingly and
    rename functions.

      [ bp: Rework a bit. ]

    Fixes: caa0ff24d5d0 ("x86/bugs: Keep a per-CPU IA32_SPEC_CTRL value")
    Suggested-by: Borislav Petkov <[email protected]>
    Signed-off-by: Pawan Gupta <[email protected]>
    Signed-off-by: Borislav Petkov (AMD) <[email protected]>
    Reviewed-by: Thomas Gleixner <[email protected]>
    Cc: <[email protected]>
    Link: https://lore.kernel.org/r/806d39b0bfec2fe8f50dc5446dff20f5bb24a959.1669821572.git.pawan.kumar.gupta@linux.intel.com
    Signed-off-by: Linus Torvalds <[email protected]>

commit 8c9a59939deb4bfafdc451100c03d1e848b4169b
Author: Zhang Xiaoxu <[email protected]>
Date:   Fri Dec 2 15:37:46 2022 -0800

    Input: raydium_ts_i2c - fix memory leak in raydium_i2c_send()

    There is a kmemleak when test the raydium_i2c_ts with bpf mock device:

      unreferenced object 0xffff88812d3675a0 (size 8):
        comm "python3", pid 349, jiffies 4294741067 (age 95.695s)
        hex dump (first 8 bytes):
          11 0e 10 c0 01 00 04 00                          ........
        backtrace:
          [<0000000068427125>] __kmalloc+0x46/0x1b0
          [<0000000090180f91>] raydium_i2c_send+0xd4/0x2bf [raydium_i2c_ts]
          [<000000006e631aee>] raydium_i2c_initialize.cold+0xbc/0x3e4 [raydium_i2c_ts]
          [<00000000dc6fcf38>] raydium_i2c_probe+0x3cd/0x6bc [raydium_i2c_ts]
          [<00000000a310de16>] i2c_device_probe+0x651/0x680
          [<00000000f5a96bf3>] really_probe+0x17c/0x3f0
          [<00000000096ba499>] __driver_probe_device+0xe3/0x170
          [<00000000c5acb4d9>] driver_probe_device+0x49/0x120
          [<00000000264fe082>] __device_attach_driver+0xf7/0x150
          [<00000000f919423c>] bus_for_each_drv+0x114/0x180
          [<00000000e067feca>] __device_attach+0x1e5/0x2d0
          [<0000000054301fc2>] bus_probe_device+0x126/0x140
          [<00000000aad93b22>] device_add+0x810/0x1130
          [<00000000c086a53f>] i2c_new_client_device+0x352/0x4e0
          [<000000003c2c248c>] of_i2c_register_device+0xf1/0x110
          [<00000000ffec4177>] of_i2c_notify+0x100/0x160
      unreferenced object 0xffff88812d3675c8 (size 8):
        comm "python3", pid 349, jiffies 4294741070 (age 95.692s)
        hex dump (first 8 bytes):
          22 00 36 2d 81 88 ff ff                          ".6-....
        backtrace:
          [<0000000068427125>] __kmalloc+0x46/0x1b0
          [<0000000090180f91>] raydium_i2c_send+0xd4/0x2bf [raydium_i2c_ts]
          [<000000001d5c9620>] raydium_i2c_initialize.cold+0x223/0x3e4 [raydium_i2c_ts]
          [<00000000dc6fcf38>] raydium_i2c_probe+0x3cd/0x6bc [raydium_i2c_ts]
          [<00000000a310de16>] i2c_device_probe+0x651/0x680
          [<00000000f5a96bf3>] really_probe+0x17c/0x3f0
          [<00000000096ba499>] __driver_probe_device+0xe3/0x170
          [<00000000c5acb4d9>] driver_probe_device+0x49/0x120
          [<00000000264fe082>] __device_attach_driver+0xf7/0x150
          [<00000000f919423c>] bus_for_each_drv+0x114/0x180
          [<00000000e067feca>] __device_attach+0x1e5/0x2d0
          [<0000000054301fc2>] bus_probe_device+0x126/0x140
          [<00000000aad93b22>] device_add+0x810/0x1130
          [<00000000c086a53f>] i2c_new_client_device+0x352/0x4e0
          [<000000003c2c248c>] of_i2c_register_device+0xf1/0x110
          [<00000000ffec4177>] of_i2c_notify+0x100/0x160

    After BANK_SWITCH command from i2c BUS, no matter success or error
    happened, the tx_buf should be freed.

    Fixes: 3b384bd6c3f2 ("Input: raydium_ts_i2c - do not split tx transactions")
    Signed-off-by: Zhang Xiaoxu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Cc: [email protected]
    Signed-off-by: Dmitry Torokhov <[email protected]>

commit a1e9185d20b56af04022d2e656802254f4ea47eb
Merge: c290db013742 b47068b4aa53
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 15:40:35 2022 -0800

    Merge tag 'sound-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

    Pull sound fixes from Takashi Iwai:
     "Likely the last piece for 6.1; the only significant fixes are ASoC
      core ops fixes, while others are device-specific (rather minor) fixes
      in ASoC and FireWire drivers.

      All appear safe enough to take as a late stage material"

    * tag 'sound-6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
      ALSA: dice: fix regression for Lexicon I-ONIX FW810S
      ASoC: cs42l51: Correct PGA Volume minimum value
      ASoC: ops: Correct bounds check for second channel on SX controls
      ASoC: tlv320adc3xxx: Fix build error for implicit function declaration
      ASoC: ops: Check bounds for second channel in snd_soc_put_volsw_sx()
      ASoC: ops: Fix bounds check for _sx controls
      ASoC: fsl_micfil: explicitly clear CHnF flags
      ASoC: fsl_micfil: explicitly clear software reset bit

commit c290db013742e98fe5b64073bc2dd8c8a2ac9e4c
Merge: bdaa78c6aa86 c082fbd687ad
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 15:35:21 2022 -0800

    Merge tag 'drm-fixes-2022-12-02' of git://anongit.freedesktop.org/drm/drm

    Pull drm fixes from Dave Airlie:
     "Things do seem to have finally settled down, just four i915 and one
      amdgpu this week. Probably won't have much for next week if you do
      push rc8 out.

      i915:
       - Fix dram info readout
       - Remove non-existent pipes from bigjoiner pipe mask
       - Fix negative value passed as remaining time
       - Never return 0 if not all requests retired

      amdgpu:
       - VCN fix for vangogh"

    * tag 'drm-fixes-2022-12-02' of git://anongit.freedesktop.org/drm/drm:
      drm/amdgpu: enable Vangogh VCN indirect sram mode
      drm/i915: Never return 0 if not all requests retired
      drm/i915: Fix negative value passed as remaining time
      drm/i915: Remove non-existent pipes from bigjoiner pipe mask
      drm/i915/mtl: Fix dram info readout

commit bdaa78c6aa861f0e8c612a0b2272423d92f0071c
Merge: 6647e76ab623 1d351f189434
Author: Linus Torvalds <[email protected]>
Date:   Fri Dec 2 13:39:38 2022 -0800

    Merge tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

    Pull misc hotfixes from Andrew Morton:
     "15 hotfixes,  11 marked cc:stable.

      Only three or four of the latter address post-6.0 issues, which is
      hopefully a sign that things are converging"

    * tag 'mm-hotfixes-stable-2022-12-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
      revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible"
      Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled
      drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame
      mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths
      mm/khugepaged: fix GUP-fast interaction by sending IPI
      mm/khugepaged: take the right locks for page table retraction
      mm: migrate: fix THP's mapcount on isolation
      mm: introduce arch_has_hw_nonleaf_pmd_young()
      mm: add dummy pmd_young() for architectures not having it
      mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes()
      tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"
      nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()
      hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing
      madvise: use zap_page_range_single for madvise dontneed
      mm: replace VM_WARN_ON to pr_warn if the node is offline with __GFP_THISNODE

commit 6647e76ab623b2b3fb2efe03a86e9c9046c52c33
Author: Linus Torvalds <[email protected]>
Date:   Wed Nov 30 16:10:52 2022 -0800

    v4l2: don't fall back to follow_pfn() if pin_user_pages_fast() fails

    The V4L2_MEMORY_USERPTR interface is long deprecated and shouldn't be
    used (and is discouraged for any modern v4l drivers).  And Seth Jenkins
    points out that the fallback to VM_PFNMAP/VM_IO is fundamentally racy
    and dangerous.

    Note that it's not even a case that should trigger, since any normal
    user pointer logic ends up just using the pin_user_pages_fast() call
    that does the proper page reference counting.  That's not the problem
    case, only if you try to use special device mappings do you have any
    issues.

    Normally I'd just remove this during the merge window, but since Seth
    pointed out the problem cases, we really want to know as soon as
    possible if there are actually any users of this odd special case of a
    legacy interface.  Neither Hans nor Mauro seem to think that such
    mis-uses of the old legacy interface should exist.  As Mauro says:

     "See, V4L2 has actually 4 streaming APIs:
            - Kernel-allocated mmap (usually referred simply as just mmap);
            - USERPTR mmap;
            - read();
            - dmabuf;

      The USERPTR is one of the oldest way to use it, coming from V4L
      version 1 times, and by far the least used one"

    And Hans chimed in on the USERPTR interface:

     "To be honest, I wouldn't mind if it goes away completely, but that's a
      bit of a pipe dream right now"

    but while removing this legacy interface entirely may be a pipe dream we
    can at least try to remove the unlikely (and actively broken) case of
    using special device mappings for USERPTR accesses.

    This replaces it with a WARN_ONCE() that we can remove once we've
    hopefully confirmed that no actual users exist.

    NOTE! Longer term, this means that a 'struct frame_vector' only ever
    contains proper page pointers, and all the games we have with converting
    them to pages can go away (grep for 'frame_vector_to_pages()' and the
    uses of 'vec->is_pfns').  But this is just the first step, to verify
    that this code really is all dead, and do so as quickly as possible.

    Reported-by: Seth Jenkins <[email protected]>
    Acked-by: Hans Verkuil <[email protected]>
    Acked-by: Mauro Carvalho Chehab <[email protected]>
    Cc: David Hildenbrand <[email protected]>
    Cc: Jan Kara <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit d0f411c0b9bdef85f647e15a2fcc790b29891f2c
Merge: 7d4a93176e01 899d2a05dc14
Author: Jens Axboe <[email protected]>
Date:   Fri Dec 2 08:01:06 2022 -0700

    Merge tag 'nvme-6.1-2022-01-02' of git://git.infradead.org/nvme into block-6.1

    Pull NVMe fixes from Christoph:

    "nvme fixes for Linux 6.1

     - fix SRCU protection of nvme_ns_head list (Caleb Sander)
     - clear the prp2 field when not used (Lei Rao)"

    * tag 'nvme-6.1-2022-01-02' of git://git.infradead.org/nvme:
      nvme: fix SRCU protection of nvme_ns_head list
      nvme-pci: clear the prp2 field when not used

commit 4bedbbd782ebbe7287231fea862c158d4f08a9e3
Author: Xiongfeng Wang <[email protected]>
Date:   Thu Dec 1 12:01:27 2022 +0800

    iommu/vt-d: Fix PCI device refcount leak in dmar_dev_scope_init()

    for_each_pci_dev() is implemented by pci_get_device(). The comment of
    pci_get_device() says that it will increase the reference count for the
    returned pci_dev and also decrease the reference count for the input
    pci_dev @from if it is not NULL.

    If we break for_each_pci_dev() loop with pdev not NULL, we need to call
    pci_dev_put() to decrease the reference count. Add the missing
    pci_dev_put() for the error path to avoid reference count leak.

    Fixes: 2e4552893038 ("iommu/vt-d: Unify the way to process DMAR device scope array")
    Signed-off-by: Xiongfeng Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>

commit afca9e19cc720bfafc75dc5ce429c185ca93f31d
Author: Xiongfeng Wang <[email protected]>
Date:   Thu Dec 1 12:01:26 2022 +0800

    iommu/vt-d: Fix PCI device refcount leak in has_external_pci()

    for_each_pci_dev() is implemented by pci_get_device(). The comment of
    pci_get_device() says that it will increase the reference count for the
    returned pci_dev and also decrease the reference count for the input
    pci_dev @from if it is not NULL.

    If we break for_each_pci_dev() loop with pdev not NULL, we need to call
    pci_dev_put() to decrease the reference count. Add the missing
    pci_dev_put() before 'return true' to avoid reference count leak.

    Fixes: 89a6079df791 ("iommu/vt-d: Force IOMMU on for platform opt in hint")
    Signed-off-by: Xiongfeng Wang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>

commit 6927d352380797ddbee18631491ec428741696e2
Author: Yang Yingliang <[email protected]>
Date:   Thu Dec 1 12:01:25 2022 +0800

    iommu/vt-d: Fix PCI device refcount leak in prq_event_thread()

    As comment of pci_get_domain_bus_and_slot() says, it returns a pci device
    with refcount increment, when finish using it, the caller must decrease
    the reference count by calling pci_dev_put(). So call pci_dev_put() after
    using the 'pdev' to avoid refcount leak.

    Besides, if the 'pdev' is null or intel_svm_prq_report() returns error,
    there is no need to trace this fault.

    Fixes: 06f4b8d09dba ("iommu/vt-d: Remove unnecessary SVA data accesses in page fault path")
    Suggested-by: Lu Baolu <[email protected]>
    Signed-off-by: Yang Yingliang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>

commit e65a6897be5e4939d477c4969a05e12d90b08409
Author: Jacob Pan <[email protected]>
Date:   Thu Dec 1 12:01:24 2022 +0800

    iommu/vt-d: Add a fix for devices need extra dtlb flush

    QAT devices on Intel Sapphire Rapids and Emerald Rapids have a defect in
    address translation service (ATS). These devices may inadvertently issue
    ATS invalidation completion before posted writes initiated with
    translated address that utilized translations matching the invalidation
    address range, violating the invalidation completion ordering.

    This patch adds an extra device TLB invalidation for the affected devices,
    it is needed to ensure no more posted writes with translated address
    following the invalidation completion. Therefore, the ordering is
    preserved and data-corruption is prevented.

    Device TLBs are invalidated under the following six conditions:
    1. Device driver does DMA API unmap IOVA
    2. Device driver unbind a PASID from a process, sva_unbind_device()
    3. PASID is torn down, after PASID cache is flushed. e.g. process
    exit_mmap() due to crash
    4. Under SVA usage, called by mmu_notifier.invalidate_range() where
    VM has to free pages that were unmapped
    5. userspace driver unmaps a DMA buffer
    6. Cache invalidation in vSVA usage (upcoming)

    For #1 and #2, device drivers are responsible for stopping DMA traffic
    before unmap/unbind. For #3, iommu driver gets mmu_notifier to
    invalidate TLB the same way as normal user unmap which will do an extra
    invalidation. The dTLB invalidation after PASID cache flush does not
    need an extra invalidation.

    Therefore, we only need to deal with #4 and #5 in this patch. #1 is also
    covered by this patch due to common code path with #5.

    Tested-by: Yuzhang Luo <[email protected]>
    Reviewed-by: Ashok Raj <[email protected]>
    Reviewed-by: Kevin Tian <[email protected]>
    Signed-off-by: Jacob Pan <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Lu Baolu <[email protected]>
    Signed-off-by: Joerg Roedel <[email protected]>

commit c082fbd687ad70a92e0a8be486a7555a66f03079
Merge: 65a388250e39 9a8cc8cabc1e
Author: Dave Airlie <[email protected]>
Date:   Fri Dec 2 09:12:46 2022 +1000

    Merge tag 'amd-drm-fixes-6.1-2022-12-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

    amd-drm-fixes-6.1-2022-12-01:

    amdgpu:
    - VCN fix for vangogh

    Signed-off-by: Dave Airlie <[email protected]>
    From: Alex Deucher <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

commit d36678f7905cbd1dc55a8a96e066dafd749d4600
Author: Andrew Lunn <[email protected]>
Date:   Thu Nov 10 00:59:02 2022 +0100

    i2c: imx: Only DMA messages with I2C_M_DMA_SAFE flag set

    Recent changes to the DMA code has resulting in the IMX driver failing
    I2C transfers when the buffer has been vmalloc. Only perform DMA
    transfers if the message has the I2C_M_DMA_SAFE flag set, indicating
    the client is providing a buffer which is DMA safe.

    This is a minimal fix for stable. The I2C core provides helpers to
    allocate a bounce buffer. For a fuller fix the master should make use
    of these helpers.

    Fixes: 4544b9f25e70 ("dma-mapping: Add vmap checks to dma_map_single()")
    Signed-off-by: Andrew Lunn <[email protected]>
    Acked-by: Oleksij Rempel <[email protected]>
    Signed-off-by: Wolfram Sang <[email protected]>

commit 7d8ccf4f117d082156e842d959f634efcf203cef
Author: Wang Yufen <[email protected]>
Date:   Mon Nov 21 18:17:52 2022 +0800

    i2c: qcom-geni: fix error return code in geni_i2c_gpi_xfer

    Fix to return a negative error code from the gi2c->err instead of
    0.

    Fixes: d8703554f4de ("i2c: qcom-geni: Add support for GPI DMA")
    Signed-off-by: Wang Yufen <[email protected]>
    Reviewed-by: Tommaso Merciai <[email protected]>
    Signed-off-by: Wolfram Sang <[email protected]>

commit 8bfd4ec726945cec35ee5cafebc1c55b83d5a9fb
Author: Carsten Haitzler <[email protected]>
Date:   Mon Nov 28 10:51:58 2022 +0000

    i2c: cadence: Fix regression with bus recovery

    Commit "i2c: cadence: Add standard bus recovery support" breaks for i2c
    devices that have no pinctrl defined. There is no requirement for this
    to exist in the DT. This has worked perfectly well without this before in
    at least 1 real usage case on hardware (Mali Komeda DPU, Cadence i2c to
    talk to a tda99xx phy). Adding the requirement to have pinctrl set up in
    the device tree (or otherwise be found) is a regression where the whole
    i2c device is lost entirely (in this case dropping entire devices which
    then leads to the drm display stack unable to find the phy for display
    output, thus having no drm display device and so on down the chain).

    This converts the above commit to an enhancement if pinctrl can be found
    for the i2c device, providing a timeout on read with recovery, but if not,
    do what used to be done rather than a fatal loss of a device.

    This restores the mentioned display devices to their working state again.

    Fixes: 58b924241d0a ("i2c: cadence: Add standard bus recovery support")
    Signed-off-by: Carsten Haitzler <[email protected]>
    Reviewed-by: Shubhrajyoti Datta <[email protected]>
    Reviewed-by: Michael Grzeschik <[email protected]>
    Acked-by: Michal Simek <[email protected]>
    [wsa: added braces to else-branch]
    Signed-off-by: Wolfram Sang <[email protected]>

commit 65a388250e391b7127efd6751f64f3e4955e43a0
Merge: b7b275e60bcd 12b8b046e4c9
Author: Dave Airlie <[email protected]>
Date:   Fri Dec 2 07:34:27 2022 +1000

    Merge tag 'drm-intel-fixes-2022-12-01' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

    - Fix dram info readout (Radhakrishna Sripada)
    - Remove non-existent pipes from bigjoiner pipe mask (Ville Syrjälä)
    - Fix negative value passed as remaining time (Janusz Krzysztofik)
    - Never return 0 if not all requests retired (Janusz Krzysztofik)

    Signed-off-by: Dave Airlie <[email protected]>
    From: Tvrtko Ursulin <[email protected]>
    Link: https://patchwork.freedesktop.org/patch/msgid/Y4hp+a3TJ13t2ZA1@tursulin-desk

commit a4412fdd49dc011bcc2c0d81ac4cab7457092650
Author: Steven Rostedt (Google) <[email protected]>
Date:   Mon Nov 21 10:44:03 2022 -0500

    error-injection: Add prompt for function error injection

    The config to be able to inject error codes into any function annotated
    with ALLOW_ERROR_INJECTION() is enabled when FUNCTION_ERROR_INJECTION is
    enabled.  But unfortunately, this is always enabled on x86 when KPROBES
    is enabled, and there's no way to turn it off.

    As kprobes is useful for observability of the kernel, it is useful to
    have it enabled in production environments.  But error injection should
    be avoided.  Add a prompt to the config to allow it to be disabled even
    when kprobes is enabled, and get rid of the "def_bool y".

    This is a kernel debug feature (it's in Kconfig.debug), and should have
    never been something enabled by default.

    Cc: [email protected]
    Fixes: 540adea3809f6 ("error-injection: Separate error-injection from kprobe")
    Signed-off-by: Steven Rostedt (Google) <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 9a8cc8cabc1e351614fd7f9e774757a5143b6fe8
Author: Leo Liu <[email protected]>
Date:   Tue Nov 29 18:53:18 2022 -0500

    drm/amdgpu: enable Vangogh VCN indirect sram mode

    So that uses PSP to initialize HW.

    Fixes: 0c2c02b66c672e ("drm/amdgpu/vcn: add firmware support for dimgrey_cavefish")
    Signed-off-by: Leo Liu <[email protected]>
    Reviewed-by: James Zhu <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]

commit 39cefc5f6cd25d555e0455b24810e9aff365b8d6
Merge: d556a9aeb62a 7e1864332fbc
Author: Palmer Dabbelt <[email protected]>
Date:   Thu Dec 1 11:38:39 2022 -0800

    RISC-V: Fix a race condition during kernel stack overflow

    This fixes a concrete bug but is also the basis for some cleanup work,
    so I'm merging it based on the offending commit in order to minimize
    future conflicts.

    * commit '7e1864332fbc1b993659eab7974da9fe8bf8c128':
      riscv: fix race when vmap stack overflow

commit 355479c70a489415ef6e2624e514f8f15a40861b
Merge: e214dd935bf1 7572ac3c979d
Author: Linus Torvalds <[email protected]>
Date:   Thu Dec 1 11:25:11 2022 -0800

    Merge tag 'efi-fixes-for-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

    Pull EFI fix from Ard Biesheuvel:
     "A single revert for some code that I added during this cycle. The code
      is not wrong, but it should be a bit more careful about how to handle
      the shadow call stack pointer, so it is better to revert it for now
      and bring it back later in improved form.

      Summary:

       - Revert runtime service sync exception recovery on arm64"

    * tag 'efi-fixes-for-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
      arm64: efi: Revert "Recover from synchronous exceptions ..."

commit e214dd935bf154af4c2d5013b16a283d592ccea5
Merge: 063c0e773ab3 9bdc112be727
Author: Linus Torvalds <[email protected]>
Date:   Thu Dec 1 11:10:25 2022 -0800

    Merge tag 'hwmon-for-v6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

    Pull hwmon fixes from Guenter Roeck:

     - Fix refcount leak and NULL pointer access in coretemp driver

     - Fix UAF in ibmpex driver

     - Add missing pci_disable_device() to i5500_temp driver

     - Fix calculation in ina3221 driver

     - Fix temperature scaling in ltc2947 driver

     - Check result of devm_kcalloc call in asus-ec-sensors driver

    * tag 'hwmon-for-v6.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
      hwmon: (asus-ec-sensors) Add checks for devm_kcalloc
      hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()
      hwmon: (coretemp) Check for null before removing sysfs attrs
      hwmon: (ibmpex) Fix possible UAF when ibmpex_register_bmc() fails
      hwmon: (i5500_temp) fix missing pci_disable_device()
      hwmon: (ina3221) Fix shunt sum critical calculation
      hwmon: (ltc2947) fix temperature scaling

commit 9bdc112be727cf1ba65be79541147f960c3349d8
Author: Yuan Can <[email protected]>
Date:   Fri Nov 25 01:43:29 2022 +0000

    hwmon: (asus-ec-sensors) Add checks for devm_kcalloc

    As the devm_kcalloc may return NULL, the return value needs to be checked
    to avoid NULL poineter dereference.

    Fixes: d0ddfd241e57 ("hwmon: (asus-ec-sensors) add driver for ASUS EC")
    Signed-off-by: Yuan Can <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Guenter Roeck <[email protected]>

commit 7dec14537c5906b8bf40fd6fd6d9c3850f8df11d
Author: Yang Yingliang <[email protected]>
Date:   Fri Nov 18 17:33:03 2022 +0800

    hwmon: (coretemp) fix pci device refcount leak in nv1a_ram_new()

    As comment of pci_get_domain_bus_and_slot() says, it returns
    a pci device with refcount increment, when finish using it,
    the caller must decrement the reference count by calling
    pci_dev_put(). So call it after using to avoid refcount leak.

    Fixes: 14513ee696a0 ("hwmon: (coretemp) Use PCI host bridge ID to identify CPU if necessary")
    Signed-off-by: Yang Yingliang <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Guenter Roeck <[email protected]>

commit a89ff5f5cc64b9fe7a992cf56988fd36f56ca82a
Author: Phil Auld <[email protected]>
Date:   Thu Nov 17 11:23:13 2022 -0500

    hwmon: (coretemp) Check for null before removing sysfs attrs

    If coretemp_add_core() gets an error then pdata->core_data[indx]
    is already NULL and has been kfreed. Don't pass that to
    sysfs_remove_group() as that will crash in sysfs_remove_group().

    [Shortened for readability]
    [91854.020159] sysfs: cannot create duplicate filename '/devices/platform/coretemp.0/hwmon/hwmon2/temp20_label'
    <cpu offline>
    [91855.126115] BUG: kernel NULL pointer dereference, address: 0000000000000188
    [91855.165103] #PF: supervisor read access in kernel mode
    [91855.194506] #PF: error_code(0x0000) - not-present page
    [91855.224445] PGD 0 P4D 0
    [91855.238508] Oops: 0000 [#1] PREEMPT SMP PTI
    ...
    [91855.342716] RIP: 0010:sysfs_remove_group+0xc/0x80
    ...
    [91855.796571] Call Trace:
    [91855.810524]  coretemp_cpu_offline+0x12b/0x1dd [coretemp]
    [91855.841738]  ? coretemp_cpu_online+0x180/0x180 [coretemp]
    [91855.871107]  cpuhp_invoke_callback+0x105/0x4b0
    [91855.893432]  cpuhp_thread_fun+0x8e/0x150
    ...

    Fix this by checking for NULL first.

    Signed-off-by: Phil Auld <[email protected]>
    Cc: [email protected]
    Cc: Fenghua Yu <[email protected]>
    Cc: Jean Delvare <[email protected]>
    Cc: Guenter Roeck <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    Fixes: 199e0de7f5df3 ("hwmon: (coretemp) Merge pkgtemp with coretemp")
    Signed-off-by: Guenter Roeck <[email protected]>

commit 7572ac3c979d4d0fb42d73a72d2608656516ff4f
Author: Ard Biesheuvel <[email protected]>
Date:   Wed Nov 30 17:37:17 2022 +0100

    arm64: efi: Revert "Recover from synchronous exceptions ..."

    This reverts commit 23715a26c8d81291, which introduced some code in
    assembler that manipulates both the ordinary and the shadow call stack
    pointer in a way that could potentially be taken advantage of. So let's
    revert it, and do a better job the next time around.

    Signed-off-by: Ard Biesheuvel <[email protected]>

commit d9f15a9de44affe733e34f93bc184945ba277e6d
Author: Conor Dooley <[email protected]>
Date:   Tue Nov 22 12:16:21 2022 +0000

    Revert "clocksource/drivers/riscv: Events are stopped during CPU suspend"

    This reverts commit 232ccac1bd9b5bfe73895f527c08623e7fa0752d.

    On the subject of suspend, the RISC-V SBI spec states:

      This does not cover whether any given events actually reach the hart or
      not, just what the hart will do if it receives an event. On PolarFire
      SoC, and potentially other SiFive based implementations, events from the
      RISC-V timer do reach a hart during suspend. This is not the case for the
      implementation on the Allwinner D1 - there timer events are not received
      during suspend.

    To fix this, the CLOCK_EVT_FEAT_C3STOP (mis)feature was enabled for the
    timer driver - but this has broken both RCU stall detection and timers
    generally on PolarFire SoC and potentially other SiFive based
    implementations.

    If an AXI read to the PCIe controller on PolarFire SoC times out, the
    system will stall, however, with CLOCK_EVT_FEAT_C3STOP active, the system
    just locks up without RCU stalling:

    	io scheduler mq-deadline registered
    	io scheduler kyber registered
    	microchip-pcie 2000000000.pcie: host bridge /soc/pcie@2000000000 ranges:
    	microchip-pcie 2000000000.pcie:      MEM 0x2008000000..0x2087ffffff -> 0x0008000000
    	microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: axi read request error
    	microchip-pcie 2000000000.pcie: axi read timeout
    	microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: sec error in pcie2axi buffer
    	microchip-pcie 2000000000.pcie: ded error in pcie2axi buffer
    	Freeing initrd memory: 7332K

    Similarly issues were reported with clock_nanosleep() - with a test app
    that sleeps each cpu for 6, 5, 4, 3 ms respectively, HZ=250 & the blamed
    commit in place, the sleep times are rounded up to the next jiffy:

    == CPU: 1 ==      == CPU: 2 ==      == CPU: 3 ==      == CPU: 4 ==
    Mean: 7.974992    Mean: 7.976534    Mean: 7.962591    Mean: 3.952179
    Std Dev: 0.154374 Std Dev: 0.156082 Std Dev: 0.171018 Std Dev: 0.076193
    Hi: 9.472000      Hi: 10.495000     Hi: 8.864000      Hi: 4.736000
    Lo: 6.087000      Lo: 6.380000      Lo: 4.872000      Lo: 3.403000
    Samples: 521      Samples: 521      Samples: 521      Samples: 521

    Fortunately, the D1 has a second timer, which is "currently used in
    preference to the RISC-V/SBI timer driver" so a revert here does not
    hurt operation of D1 in its current form.

    Ultimately, a DeviceTree property (or node) will be added to encode the
    behaviour of the timers, but until then revert the addition of
    CLOCK_EVT_FEAT_C3STOP.

    Fixes: 232ccac1bd9b ("clocksource/drivers/riscv: Events are stopped during CPU suspend")
    Signed-off-by: Conor Dooley <[email protected]>
    Signed-off-by: Thomas Gleixner <[email protected]>
    Reviewed-by: Palmer Dabbelt <[email protected]>
    Acked-by: Palmer Dabbelt <[email protected]>
    Acked-by: Samuel Holland <[email protected]>
    Link: https://lore.kernel.org/linux-riscv/YzYTNQRxLr7Q9JR0@spud/
    Link: https://github.com/riscv-non-isa/riscv-sbi-doc/issues/98/
    Link: https://lore.kernel.org/linux-riscv/[email protected]/
    Link: https://lore.kernel.org/r/[email protected]

commit dd30dcfa7a74a06f8dcdab260d8d5adf32f17333
Author: Wenchao Chen <[email protected]>
Date:   Wed Nov 30 20:13:28 2022 +0800

    mmc: sdhci-sprd: Fix no reset data and command after voltage switch

    After switching the voltage, no reset data and command will cause
    CMD2 timeout.

    Fixes: 29ca763fc26f ("mmc: sdhci-sprd: Add pin control support for voltage switch")
    Signed-off-by: Wenchao Chen <[email protected]>
    Acked-by: Adrian Hunter <[email protected]>
    Reviewed-by: Baolin Wang <[email protected]>
    Cc: [email protected]
    Link: https://lore.kernel.org/r/[email protected]
    Signed-off-by: Ulf Hansson <[email protected]>

commit 063c0e773ab33103407a76051821777014b756fe
Merge: ef4d3ea40565 f6abcc21d943
Author: Linus Torvalds <[email protected]>
Date:   Wed Nov 30 15:46:46 2022 -0800

    Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

    Pull clk fixes from Stephen Boyd:
     "A set of clk driver fixes that resolve issues for various SoCs.

      Most of these are incorrect clk data, like bad parent descriptions.
      When the clk tree is improperly described things don't work, like USB
      and UFS controllers, because clk frequencies are wonky. Here are the
      extra details:

       - Fix the parent of UFS reference clks on Qualcomm SC8280XP so that
         UFS works properly

       - Fix the clk ID for USB on AT91 RM9200 so the USB driver continues
         to probe

       - Stop using of_device_get_match_data() on the wrong device for a
         Samsung Exynos driver so it gets the proper clk data

       - Fix ExynosAutov9 binding

       - Fix the parent of the div4 clk on Exynos7885

       - Stop calling runtime PM APIs from the Qualcomm GDSC driver directly
         as it leads to a lockdep splat and is just plain wrong because it
         violates runtime PM semantics by calling runtime PM APIs when the
         device has been runtime PM disabled"

    * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
      clk: qcom: gcc-sc8280xp: add cxo as parent for three ufs ref clks
      ARM: at91: rm9200: fix usb device clock id
      clk: samsung: Revert "clk: samsung: exynos-clkout: Use of_device_get_match_data()"
      dt-bindings: clock: exynosautov9: fix reference to CMU_FSYS1
      clk: qcom: gdsc: Remove direct runtime PM calls
      clk: samsung: exynos7885: Correct "div4" clock parents

commit 1d351f1894342c378b96bb9ed89f8debb1e24e9f
Author: Andrew Morton <[email protected]>
Date:   Mon Nov 28 13:21:40 2022 -0800

    revert "kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible"

    It causes build failures with unusual CC/HOSTCC combinations.

    Quoting
    https://lkml.kernel.org/r/[email protected]:

      HOSTCC  scripts/mod/modpost.o - due to target missing
    In file included from include/linux/string.h:5,
                     from scripts/mod/../../include/linux/license.h:5,
                     from scripts/mod/modpost.c:24:
    include/linux/compiler.h:246:10: fatal error: asm/rwonce.h: No such file or directory
      246 | #include <asm/rwonce.h>
          |          ^~~~~~~~~~~~~~
    compilation terminated.

    ...

    The problem is that HOSTCC is not necessarily the same compiler or even
    architecture as CC and pulling in <linux/compiler.h> or <asm/rwonce.h>
    files indirectly isn't a good idea then.

    My toolchain is providing HOSTCC = gcc (MacPorts) and CC = arm-linux-gnueabihf
    (built from gcc source) and all running on Darwin.

    If I change the include to <string.h> I can then "HOSTCC scripts/mod/modpost.c"
    but then it fails for "CC kernel/module/main.c" not finding <string.h>:

      CC      kernel/module/main.o - due to target missing
    In file included from kernel/module/main.c:43:0:
    ./include/linux/license.h:5:20: fatal error: string.h: No such file or directory
     #include <string.h>
                        ^
    compilation terminated.

    Reported-by: "H. Nikolaus Schaller" <[email protected]>
    Cc: Sam James <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit 152fe65f300e1819d59b80477d3e0999b4d5d7d2
Author: Lee Jones <[email protected]>
Date:   Fri Nov 25 12:07:50 2022 +0000

    Kconfig.debug: provide a little extra FRAME_WARN leeway when KASAN is enabled

    When enabled, KASAN enlarges function's stack-frames.  Pushing quite a few
    over the current threshold.  This can mainly be seen on 32-bit
    architectures where the present limit (when !GCC) is a lowly 1024-Bytes.

    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Lee Jones <[email protected]>
    Acked-by: Arnd Bergmann <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: "Christian König" <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Leo Li <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Maxime Ripard <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: Nick Desaulniers <[email protected]>
    Cc: "Pan, Xinhui" <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: Tom Rix <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit 6f6cb1714365a07dbc66851879538df9f6969288
Author: Lee Jones <[email protected]>
Date:   Fri Nov 25 12:07:49 2022 +0000

    drm/amdgpu: temporarily disable broken Clang builds due to blown stack-frame

    Patch series "Fix a bunch of allmodconfig errors", v2.

    Since b339ec9c229aa ("kbuild: Only default to -Werror if COMPILE_TEST")
    WERROR now defaults to COMPILE_TEST meaning that it's enabled for
    allmodconfig builds.  This leads to some interesting build failures when
    using Clang, each resolved in this set.

    With this set applied, I am able to obtain a successful allmodconfig Arm
    build.

    This patch (of 2):

    calculate_bandwidth() is presently broken on all !(X86_64 || SPARC64 ||
    ARM64) architectures built with Clang (all released versions), whereby the
    stack frame gets blown up to well over 5k.  This would cause an immediate
    kernel panic on most architectures.  We'll revert this when the following
    bug report has been resolved:
    https://github.com/llvm/llvm-project/issues/41896.

    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Lee Jones <[email protected]>
    Suggested-by: Arnd Bergmann <[email protected]>
    Acked-by: Arnd Bergmann <[email protected]>
    Cc: Alex Deucher <[email protected]>
    Cc: "Christian König" <[email protected]>
    Cc: Daniel Vetter <[email protected]>
    Cc: David Airlie <[email protected]>
    Cc: Harry Wentland <[email protected]>
    Cc: Lee Jones <[email protected]>
    Cc: Leo Li <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Maxime Ripard <[email protected]>
    Cc: Nathan Chancellor <[email protected]>
    Cc: Nick Desaulniers <[email protected]>
    Cc: "Pan, Xinhui" <[email protected]>
    Cc: Rodrigo Siqueira <[email protected]>
    Cc: Thomas Zimmermann <[email protected]>
    Cc: Tom Rix <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit f268f6cf875f3220afc77bdd0bf1bb136eb54db9
Author: Jann Horn <[email protected]>
Date:   Fri Nov 25 22:37:14 2022 +0100

    mm/khugepaged: invoke MMU notifiers in shmem/file collapse paths

    Any codepath that zaps page table entries must invoke MMU notifiers to
    ensure that secondary MMUs (like KVM) don't keep accessing pages which
    aren't mapped anymore.  Secondary MMUs don't hold their own references to
    pages that are mirrored over, so failing to notify them can lead to page
    use-after-free.

    I'm marking this as addressing an issue introduced in commit f3f0e1d2150b
    ("khugepaged: add support of collapse for tmpfs/shmem pages"), but most of
    the security impact of this only came in commit 27e1f8273113 ("khugepaged:
    enable collapse pmd for pte-mapped THP"), which actually omitted flushes
    for the removal of present PTEs, not just for the removal of empty page
    tables.

    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
    Signed-off-by: Jann Horn <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Cc: John Hubbard <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit 2ba99c5e08812494bc57f319fb562f527d9bacd8
Author: Jann Horn <[email protected]>
Date:   Fri Nov 25 22:37:13 2022 +0100

    mm/khugepaged: fix GUP-fast interaction by sending IPI

    Since commit 70cbc3cc78a99 ("mm: gup: fix the fast GUP race against THP
    collapse"), the lockless_pages_from_mm() fastpath rechecks the pmd_t to
    ensure that the page table was not removed by khugepaged in between.

    However, lockless_pages_from_mm() still requires that the page table is
    not concurrently freed.  Fix it by sending IPIs (if the architecture uses
    semi-RCU-style page table freeing) before freeing/reusing page tables.

    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: ba76149f47d8 ("thp: khugepaged")
    Signed-off-by: Jann Horn <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Cc: John Hubbard <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit 8d3c106e19e8d251da31ff4cc7462e4565d65084
Author: Jann Horn <[email protected]>
Date:   Fri Nov 25 22:37:12 2022 +0100

    mm/khugepaged: take the right locks for page table retraction

    pagetable walks on address ranges mapped by VMAs can be done under the
    mmap lock, the lock of an anon_vma attached to the VMA, or the lock of the
    VMA's address_space.  Only one of these needs to be held, and it does not
    need to be held in exclusive mode.

    Under those circumstances, the rules for concurrent access to page table
    entries are:

     - Terminal page table entries (entries that don't point to another page
       table) can be arbitrarily changed under the page table lock, with the
       exception that they always need to be consistent for
       hardware page table walks and lockless_pages_from_mm().
       This includes that they can be changed into non-terminal entries.
     - Non-terminal page table entries (which point to another page table)
       can not be modified; readers are allowed to READ_ONCE() an entry, verify
       that it is non-terminal, and then assume that its value will stay as-is.

    Retracting a page table involves modifying a non-terminal entry, so
    page-table-level locks are insufficient to protect against concurrent page
    table traversal; it requires taking all the higher-level locks under which
    it is possible to start a page walk in the relevant range in exclusive
    mode.

    The collapse_huge_page() path for anonymous THP already follows this rule,
    but the shmem/file THP path was getting it wrong, making it possible for
    concurrent rmap-based operations to cause corruption.

    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 27e1f8273113 ("khugepaged: enable collapse pmd for pte-mapped THP")
    Signed-off-by: Jann Horn <[email protected]>
    Reviewed-by: Yang Shi <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Cc: John Hubbard <[email protected]>
    Cc: Peter Xu <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit 829ae0f81ce093d674ff2256f66a714753e9ce32
Author: Gavin Shan <[email protected]>
Date:   Thu Nov 24 17:55:23 2022 +0800

    mm: migrate: fix THP's mapcount on isolation

    The issue is reported when removing memory through virtio_mem device.  The
    transparent huge page, experienced copy-on-write fault, is wrongly
    regarded as pinned.  The transparent huge page is escaped from being
    isolated in isolate_migratepages_block().  The transparent huge page can't
    be migrated and the corresponding memory block can't be put into offline
    state.

    Fix it by replacing page_mapcount() with total_mapcount().  With this, the
    transparent huge page can be isolated and migrated, and the memory block
    can be put into offline state.  Besides, The page's refcount is increased
    a bit earlier to avoid the page is released when the check is executed.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: 1da2f328fa64 ("mm,thp,compaction,cma: allow THP migration for CMA allocations")
    Signed-off-by: Gavin Shan <[email protected]>
    Reported-by: Zhenyu Zhang <[email protected]>
    Tested-by: Zhenyu Zhang <[email protected]>
    Suggested-by: David Hildenbrand <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>
    Cc: Alistair Popple <[email protected]>
    Cc: Hugh Dickins <[email protected]>
    Cc: Kirill A. Shutemov <[email protected]>
    Cc: Matthew Wilcox <[email protected]>
    Cc: William Kucharski <[email protected]>
    Cc: Zi Yan <[email protected]>
    Cc: <[email protected]>	[5.7+]
    Signed-off-by: Andrew Morton <[email protected]>

commit 4aaf269c768dbacd6268af73fda2ffccaa3f1d88
Author: Juergen Gross <[email protected]>
Date:   Wed Nov 23 07:45:10 2022 +0100

    mm: introduce arch_has_hw_nonleaf_pmd_young()

    When running as a Xen PV guests commit eed9a328aa1a ("mm: x86: add
    CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG") can cause a protection violation in
    pmdp_test_and_clear_young():

     BUG: unable to handle page fault for address: ffff8880083374d0
     #PF: supervisor write access in kernel mode
     #PF: error_code(0x0003) - permissions violation
     PGD 3026067 P4D 3026067 PUD 3027067 PMD 7fee5067 PTE 8010000008337065
     Oops: 0003 [#1] PREEMPT SMP NOPTI
     CPU: 7 PID: 158 Comm: kswapd0 Not tainted 6.1.0-rc5-20221118-doflr+ #1
     RIP: e030:pmdp_test_and_clear_young+0x25/0x40

    This happens because the Xen hypervisor can't emulate direct writes to
    page table entries other than PTEs.

    This can easily be fixed by introducing arch_has_hw_nonleaf_pmd_young()
    similar to arch_has_hw_pte_young() and test that instead of
    CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: eed9a328aa1a ("mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG")
    Signed-off-by: Juergen Gross <[email protected]>
    Reported-by: Sander Eikelenboom <[email protected]>
    Acked-by: Yu Zhao <[email protected]>
    Tested-by: Sander Eikelenboom <[email protected]>
    Acked-by: David Hildenbrand <[email protected]>	[core changes]
    Signed-off-by: Andrew Morton <[email protected]>

commit 6617da8fb565445e0be4a4885443006374943d09
Author: Juergen Gross <[email protected]>
Date:   Wed Nov 30 14:49:41 2022 -0800

    mm: add dummy pmd_young() for architectures not having it

    In order to avoid #ifdeffery add a dummy pmd_young() implementation as a
    fallback.  This is required for the later patch "mm: introduce
    arch_has_hw_nonleaf_pmd_young()".

    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Juergen Gross <[email protected]>
    Acked-by: Yu Zhao <[email protected]>
    Cc: Borislav Petkov <[email protected]>
    Cc: Dave Hansen <[email protected]>
    Cc: Geert Uytterhoeven <[email protected]>
    Cc: "H. Peter Anvin" <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Sander Eikelenboom <[email protected]>
    Cc: Thomas Gleixner <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit 95bc35f9bee5220dad4e8567654ab3288a181639
Author: SeongJae Park <[email protected]>
Date:   Tue Nov 22 19:48:31 2022 +0000

    mm/damon/sysfs: fix wrong empty schemes assumption under online tuning in damon_sysfs_set_schemes()

    Commit da87878010e5 ("mm/damon/sysfs: support online inputs update") made
    'damon_sysfs_set_schemes()' to be called for running DAMON context, which
    could have schemes.  In the case, DAMON sysfs interface is supposed to
    update, remove, or add schemes to reflect the sysfs files.  However, the
    code is assuming the DAMON context wouldn't have schemes at all, and
    therefore creates and adds new schemes.  As a result, the code doesn't
    work as intended for online schemes tuning and could have more than
    expected memory footprint.  The schemes are all in the DAMON context, so
    it doesn't leak the memory, though.

    Remove the wrong asssumption (the DAMON context wouldn't have schemes) in
    'damon_sysfs_set_schemes()' to fix the bug.

    Link: https://lkml.kernel.org/r/[email protected]
    Fixes: da87878010e5 ("mm/damon/sysfs: support online inputs update")
    Signed-off-by: SeongJae Park <[email protected]>
    Cc: <[email protected]>	[5.19+]
    Signed-off-by: Andrew Morton <[email protected]>

commit a435874bf626f55d7147026b059008c8de89fbb8
Author: Tiezhu Yang <[email protected]>
Date:   Sat Nov 19 10:36:59 2022 +0800

    tools/vm/slabinfo-gnuplot: use "grep -E" instead of "egrep"

    The latest version of grep claims the egrep is now obsolete so the build
    now contains warnings that look like:

    	egrep: warning: egrep is obsolescent; using grep -E

    fix this up by moving the related file to use "grep -E" instead.

      sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/vm`

    Here are the steps to install the latest grep:

      wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
      tar xf grep-3.8.tar.gz
      cd grep-3.8 && ./configure && make
      sudo make install
      export PATH=/usr/local/bin:$PATH

    Link: https://lkml.kernel.org/r/[email protected]
    Signed-off-by: Tiezhu Yang <[email protected]>
    Reviewed-by: Sergey Senozhatsky <[email protected]>
    Cc: Vlastimil Babka <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Andrew Morton <[email protected]>

commit f0a0ccda18d6fd826d7c7e7ad48a6ed61c20f8b4
Author: ZhangPeng <[email protected]>
Date:   Sat Nov 19 21:05:42 2022 +0900

    nilfs2: fix NULL pointer dereference in nilfs_palloc_commit_free_entry()

    Syzbot reported a null-ptr-deref bug:

     NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP
     frequency < 30 seconds
     general protection fault, probably for non-canonical address
     0xdffffc0000000002: 0000 [#1] PREEMPT SMP KASAN
     KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
     CPU: 1…
earlAchromatic added a commit to earlAchromatic/linux that referenced this pull request Oct 4, 2023
* Merge tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fix from Ted Ts'o:
 "Fix a double unlock bug on an error path in ext4, found by smatch and
  syzkaller"

* tag 'ext4_for_linus_urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix possible double unlock when moving a directory

* Merge tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:
 "There's a little bit more 'movement' in there for my taste but it
  needs to happen and should make the code better after it.

   - Check cmdline_find_option()'s return value before further
     processing

   - Clear temporary storage in the resctrl code to prevent access to an
     unexistent MSR

   - Add a simple throttling mechanism to protect the hypervisor from
     potentially malicious SEV guests issuing requests in rapid
     succession.

     In order to not jeopardize the sanity of everyone involved in
     maintaining this code, the request issuing side has received a
     cleanup, split in more or less trivial, small and digestible
     pieces. Otherwise, the code was threatening to become an
     unmaintainable mess.

     Therefore, that cleanup is marked indirectly also for stable so
     that there's no differences between the upstream code and the
     stable variant when it comes down to backporting more there"

* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Fix use of uninitialized buffer in sme_enable()
  x86/resctrl: Clear staged_config[] before and after it is used
  virt/coco/sev-guest: Add throttling awareness
  virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
  virt/coco/sev-guest: Do some code style cleanups
  virt/coco/sev-guest: Carve out the request issuing logic into a helper
  virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
  virt/coco/sev-guest: Simplify extended guest request handling
  virt/coco/sev-guest: Check SEV_SNP attribute at probe time

* Merge tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Borislav Petkov:

 - Check whether sibling events have been deactivated before adding them
   to groups

 - Update the proper event time tracking variable depending on the event
   type

 - Fix a memory overwrite issue due to using the wrong function argument
   when outputting perf events

* tag 'perf_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Fix check before add_event_to_groups() in perf_group_detach()
  perf: fix perf_event_context->time
  perf/core: Fix perf_output_begin parameter is incorrectly invoked in perf_event_bpf_output

* xfs: walk all AGs if TRYLOCK passed to xfs_alloc_vextent_iterate_ags

Callers of xfs_alloc_vextent_iterate_ags that pass in the TRYLOCK flag
want us to perform a non-blocking scan of the AGs for free space.  There
are no ordering constraints for non-blocking AGF lock acquisition, so
the scan can freely start over at AG 0 even when minimum_agno > 0.

This manifests fairly reliably on xfs/294 on 6.3-rc2 with the parent
pointer patchset applied and the realtime volume enabled.  I observed
the following sequence as part of an xfs_dir_createname call:

0. Fragment the free space, then allocate nearly all the free space in
   all AGs except AG 0.

1. Create a directory in AG 2 and let it grow for a while.

2. Try to allocate 2 blocks to expand the dirent part of a directory.
   The space will be allocated out of AG 0, but the allocation will not
   be contiguous.  This (I think) activates the LOWMODE allocator.

3. The bmapi call decides to convert from extents to bmbt format and
   tries to allocate 1 block.  This allocation request calls
   xfs_alloc_vextent_start_ag with the inode number, which starts the
   scan at AG 2.  We ignore AG 0 (with all its free space) and instead
   scrape AG 2 and 3 for more space.  We find one block, but this now
   kicks t_highest_agno to 3.

4. The createname call decides it needs to split the dabtree.  It tries
   to allocate even more space with xfs_alloc_vextent_start_ag, but now
   we're constrained to AG 3, and we don't find the space.  The
   createname returns ENOSPC and the filesystem shuts down.

This change fixes the problem by making the trylock scan wrap around to
AG 0 if it doesn't like the AGs that it finds.  Since the current
transaction itself holds AGF 0, the trylock of AGF 0 will succeed, and
we take space from the AG that has plenty.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

* xfs: add tracepoints for each of the externally visible allocators

There are now five separate space allocator interfaces exposed to the
rest of XFS for five different strategies to find space.  Add
tracepoints for each of them so that I can tell from a trace dump
exactly which ones got called and what happened underneath them.  Add a
sixth so it's more obvious if an allocation actually happened.

Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

* xfs: test dir/attr hash when loading module

Back in the 6.2-rc1 days, Eric Whitney reported a fstests regression in
ext4 against generic/454.  The cause of this test failure was the
unfortunate combination of setting an xattr name containing UTF8 encoded
emoji, an xattr hash function that accepted a char pointer with no
explicit signedness, signed type extension of those chars to an int, and
the 6.2 build tools maintainers deciding to mandate -funsigned-char
across the board.  As a result, the ondisk extended attribute structure
written out by 6.1 and 6.2 were not the same.

This discrepancy, in fact, had been noticeable if a filesystem with such
an xattr were moved between any two architectures that don't employ the
same signedness of a raw "char" declaration.  The only reason anyone
noticed is that x86 gcc defaults to signed, and no such -funsigned-char
update was made to e2fsprogs, so e2fsck immediately started reporting
data corruption.

After a day and a half of discussing how to handle this use case (xattrs
with bit 7 set anywhere in the name) without breaking existing users,
Linus merged his own patch and didn't tell the maintainer.  None of the
ext4 developers realized this until AUTOSEL announced that the commit
had been backported to stable.

In the end, this problem could have been detected much earlier if there
had been any useful tests of hash function(s) in use inside ext4 to make
sure that they always produce the same outputs given the same inputs.

The XFS dirent/xattr name hash takes a uint8_t*, so I don't think it's
vulnerable to this problem.  However, let's avoid all this drama by
adding our own self test to check that the da hash produces the same
outputs for a static pile of inputs on various platforms.  This enables
us to fix any breakage that may result in a controlled fashion.  The
buffer and test data are identical to the patches submitted to xfsprogs.

Link: https://lore.kernel.org/linux-ext4/Y8bpkm3jA3bDm3eL@debian-BULLSEYE-live-builder-AMD64/
Link: https://lore.kernel.org/linux-xfs/ZBUKCRR7xvIqPrpX@destitution/T/#md38272cc684e2c0d61494435ccbb91f022e8dee4
Signed-off-by: Darrick J. Wong <[email protected]>
Reviewed-by: Dave Chinner <[email protected]>

* Merge tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull RAS fix from Borislav Petkov:

 - Flush out logged errors immediately after MCA banks configuration
   changes over sysfs have been done instead of waiting until something
   else triggers the workqueue later - another error or the polling
   interval cycle is reached

* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Make sure logged MCEs are processed after sysfs update

* cpumask: introduce for_each_cpu_or

Equivalent of for_each_cpu_and, except it ORs the two masks together
so it iterates all the CPUs present in either mask.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

* pcpcntrs: fix dying cpu summation race

In commit f689054aace2 ("percpu_counter: add percpu_counter_sum_all
interface") a race condition between a cpu dying and
percpu_counter_sum() iterating online CPUs was identified. The
solution was to iterate all possible CPUs for summation via
percpu_counter_sum_all().

We recently had a percpu_counter_sum() call in XFS trip over this
same race condition and it fired a debug assert because the
filesystem was unmounting and the counter *should* be zero just
before we destroy it. That was reported here:

https://lore.kernel.org/linux-kernel/[email protected]/

likely as a result of running generic/648 which exercises
filesystems in the presence of CPU online/offline events.

The solution to use percpu_counter_sum_all() is an awful one. We
use percpu counters and percpu_counter_sum() for accurate and
reliable threshold detection for space management, so a summation
race condition during these operations can result in overcommit of
available space and that may result in filesystem shutdowns.

As percpu_counter_sum_all() iterates all possible CPUs rather than
just those online or even those present, the mask can include CPUs
that aren't even installed in the machine, or in the case of
machines that can hot-plug CPU capable nodes, even have physical
sockets present in the machine.

Fundamentally, this race condition is caused by the CPU being
offlined being removed from the cpu_online_mask before the notifier
that cleans up per-cpu state is run. Hence percpu_counter_sum() will
not sum the count for a cpu currently being taken offline,
regardless of whether the notifier has run or not. This is
the root cause of the bug.

The percpu counter notifier iterates all the registered counters,
locks the counter and moves the percpu count to the global sum.
This is serialised against other operations that move the percpu
counter to the global sum as well as percpu_counter_sum() operations
that sum the percpu counts while holding the counter lock.

Hence the notifier is safe to run concurrently with sum operations,
and the only thing we actually need to care about is that
percpu_counter_sum() iterates dying CPUs. That's trivial to do,
and when there are no CPUs dying, it has no addition overhead except
for a cpumask_or() operation.

This change makes percpu_counter_sum() always do the right thing in
the presence of CPU hot unplug events and makes
percpu_counter_sum_all() unnecessary. This, in turn, means that
filesystems like XFS, ext4, and btrfs don't have to work out when
they should use percpu_counter_sum() vs percpu_counter_sum_all() in
their space accounting algorithms

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

* fork: remove use of percpu_counter_sum_all

This effectively reverts the change made in commit f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") as the
race condition percpu_counter_sum_all() was invented to avoid is
now handled directly in percpu_counter_sum() and nobody needs to
care about summing racing with cpu unplug anymore.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

* pcpcntr: remove percpu_counter_sum_all()

percpu_counter_sum_all() is now redundant as the race condition it
was invented to handle is now dealt with by percpu_counter_sum()
directly and all users of percpu_counter_sum_all() have been
removed.

Remove it.

This effectively reverts the changes made in f689054aace2
("percpu_counter: add percpu_counter_sum_all interface") except for
the cpumask iteration that fixes percpu_counter_sum() made earlier
in this series.

Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Darrick J. Wong <[email protected]>
Signed-off-by: Darrick J. Wong <[email protected]>

* Merge tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are a few small char/misc/other driver subsystem patches to
  resolve reported problems for 6.3-rc3.

  Included in here are:

   - Interconnect driver fixes for reported problems

   - Memory driver fixes for reported problems

   - nvmem core fix

   - firmware driver fix for reported problem

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (23 commits)
  memory: tegra30-emc: fix interconnect registration race
  memory: tegra20-emc: fix interconnect registration race
  memory: tegra124-emc: fix interconnect registration race
  memory: tegra: fix interconnect registration race
  interconnect: exynos: drop redundant link destroy
  interconnect: exynos: fix registration race
  interconnect: exynos: fix node leak in probe PM QoS error path
  interconnect: qcom: msm8974: fix registration race
  interconnect: qcom: rpmh: fix registration race
  interconnect: qcom: rpmh: fix probe child-node error handling
  interconnect: qcom: rpm: fix registration race
  nvmem: core: return -ENOENT if nvmem cell is not found
  firmware: xilinx: don't make a sleepable memory allocation from an atomic context
  interconnect: qcom: rpm: fix probe child-node error handling
  interconnect: qcom: osm-l3: fix registration race
  interconnect: imx: fix registration race
  interconnect: fix provider registration API
  interconnect: fix icc_provider_del() error handling
  interconnect: fix mem leak when freeing nodes
  interconnect: qcom: qcm2290: Fix MASTER_SNOC_BIMC_NRT
  ...

* Merge tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty/serial driver fixes from Greg KH:
 "Here are some small tty and serial driver fixes for 6.3-rc3 to resolve
  some reported issues.

  They include:

   - 8250 driver Kconfig issue pointed out by you that showed up in -rc1

   - qcom-geni serial driver fixes

   - various 8250 driver fixes for reported problems

   - fsl_lpuart driver fixes

   - serdev fix for regression in -rc1

   - vt.c bugfix

  All have been in linux-next for over a week with no reported problems"

* tag 'tty-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  tty: vt: protect KD_FONT_OP_GET_TALL from unbound access
  serial: qcom-geni: drop bogus uart_write_wakeup()
  serial: qcom-geni: fix mapping of empty DMA buffer
  serial: qcom-geni: fix DMA mapping leak on shutdown
  serial: qcom-geni: fix console shutdown hang
  serdev: Set fwnode for serdev devices
  tty: serial: fsl_lpuart: fix race on RX DMA shutdown
  serial: 8250_pci1xxxx: Disable SERIAL_8250_PCI1XXXX config by default
  serial: 8250_fsl: fix handle_irq locking
  serial: 8250_em: Fix UART port type
  serial: 8250: ASPEED_VUART: select REGMAP instead of depending on it
  tty: serial: fsl_lpuart: skip waiting for transmission complete when UARTCTRL_SBK is asserted
  Revert "tty: serial: fsl_lpuart: adjust SERIAL_FSL_LPUART_CONSOLE config dependency"

* tracing: Make splice_read available again

Since the commit 36e2c7421f02 ("fs: don't allow splice read/write
without explicit ops") is applied to the kernel, splice() and
sendfile() calls on the trace file (/sys/kernel/debug/tracing
/trace) return EINVAL.

This patch restores these system calls by initializing splice_read
in file_operations of the trace file. This patch only enables such
functionalities for the read case.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: [email protected]
Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: Sung-hun Kim <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

* ring-buffer: remove obsolete comment for free_buffer_page()

The comment refers to mm/slob.c which is being removed. It comes from
commit ed56829cb319 ("ring_buffer: reset buffer page when freeing") and
according to Steven the borrowed code was a page mapcount and mapping
reset, which was later removed by commit e4c2ce82ca27 ("ring_buffer:
allocate buffer page pointer"). Thus the comment is not accurate anyway,
remove it.

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: Masami Hiramatsu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Reported-by: Mike Rapoport <[email protected]>
Suggested-by: Steven Rostedt (Google) <[email protected]>
Fixes: e4c2ce82ca27 ("ring_buffer: allocate buffer page pointer")
Signed-off-by: Vlastimil Babka <[email protected]>
Reviewed-by: Mukesh Ojha <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

* tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr

There is a problem with the behavior of hwlat in a container,
resulting in incorrect output. A warning message is generated:
"cpumask changed while in round-robin mode, switching to mode none",
and the tracing_cpumask is ignored. This issue arises because
the kernel thread, hwlatd, is not a part of the container, and
the function sched_setaffinity is unable to locate it using its PID.
Additionally, the task_struct of hwlatd is already known.
Ultimately, the function set_cpus_allowed_ptr achieves
the same outcome as sched_setaffinity, but employs task_struct
instead of PID.

Test case:

  # cd /sys/kernel/tracing
  # echo 0 > tracing_on
  # echo round-robin > hwlat_detector/mode
  # echo hwlat > current_tracer
  # unshare --fork --pid bash -c 'echo 1 > tracing_on'
  # dmesg -c

Actual behavior:

[573502.809060] hwlat_detector: cpumask changed while in round-robin mode, switching to mode none

Link: https://lore.kernel.org/linux-trace-kernel/[email protected]

Cc: Masami Hiramatsu <[email protected]>
Fixes: 0330f7aa8ee63 ("tracing: Have hwlat trace migrate across tracing_cpumask CPUs")
Signed-off-by: Costa Shulyupin <[email protected]>
Acked-by: Daniel Bristot de Oliveira <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>

* Merge tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fixes from Steven Rostedt:

 - Fix setting affinity of hwlat threads in containers

   Using sched_set_affinity() has unwanted side effects when being
   called within a container. Use set_cpus_allowed_ptr() instead

 - Fix per cpu thread management of the hwlat tracer:
    - Do not start per_cpu threads if one is already running for the CPU
    - When starting per_cpu threads, do not clear the kthread variable
      as it may already be set to running per cpu threads

 - Fix return value for test_gen_kprobe_cmd()

   On error the return value was overwritten by being set to the result
   of the call from kprobe_event_delete(), which would likely succeed,
   and thus have the function return success

 - Fix splice() reads from the trace file that was broken by commit
   36e2c7421f02 ("fs: don't allow splice read/write without explicit
   ops")

 - Remove obsolete and confusing comment in ring_buffer.c

   The original design of the ring buffer used struct page flags for
   tricks to optimize, which was shortly removed due to them being
   tricks. But a comment for those tricks remained

 - Set local functions and variables to static

* tag 'trace-v6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/hwlat: Replace sched_setaffinity with set_cpus_allowed_ptr
  ring-buffer: remove obsolete comment for free_buffer_page()
  tracing: Make splice_read available again
  ftrace: Set direct_ops storage-class-specifier to static
  trace/hwlat: Do not start per-cpu thread if it is already running
  trace/hwlat: Do not wipe the contents of per-cpu thread data
  tracing/osnoise: set several trace_osnoise.c variables storage-class-specifier to static
  tracing: Fix wrong return in kprobe_event_gen_test.c

* Linux 6.3-rc3

* thunderbolt: Use const qualifier for `ring_interrupt_index`

`ring_interrupt_index` doesn't change the data for `ring` so mark it as
const. This is needed by the following patch that disables interrupt
auto clear for rings.

Cc: Sanju Mehta <[email protected]>
Cc: [email protected]
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>

* thunderbolt: Disable interrupt auto clear for rings

When interrupt auto clear is programmed, any read to the interrupt
status register will clear all interrupts.  If two interrupts have
come in before one can be serviced then this will cause lost interrupts.

On AMD USB4 routers this has manifested in odd problems particularly
with long strings of control tranfers such as reading the DROM via bit
banging.

Instead of clearing interrupts automatically, clear the bit corresponding
to the given ring's interrupt in the ISR.

Fixes: 7a1808f82a37 ("thunderbolt: Handle ring interrupt by reading interrupt status register")
Cc: Sanju Mehta <[email protected]>
Cc: [email protected]
Tested-by: Anson Tsao <[email protected]>
Signed-off-by: Mario Limonciello <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>

* drm/i915/mtl: Fix Wa_16015201720 implementation

The commit 2357f2b271ad ("drm/i915/mtl: Initial display workarounds")
extended the workaround Wa_16015201720 to MTL. However the registers
that the original WA implemented moved for MTL.

Implement the workaround with the correct register.

v3: Skip clock gating for pipe C, D DMC's and fix the title

Fixes: 2357f2b271ad ("drm/i915/mtl: Initial display workarounds")
Cc: Matt Atwood <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Signed-off-by: Radhakrishna Sripada <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 0188be507b973e36f637ba010a369057c8cb7282)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915/fbdev: lock the fbdev obj before vma pin

lock the fbdev obj before calling into
i915_vma_pin_iomap(). This helps to solve below :

<7>[   93.563308] i915 0000:00:02.0: [drm:intelfb_create [i915]] no BIOS fb, allocating a new one
<4>[   93.581844] ------------[ cut here ]------------
<4>[   93.581855] WARNING: CPU: 12 PID: 625 at drivers/gpu/drm/i915/gem/i915_gem_pages.c:424 i915_gem_object_pin_map+0x152/0x1c0 [i915]

Fixes: f0b6b01b3efe ("drm/i915: Add ww context to intel_dpt_pin, v2.")
Cc: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Maarten Lankhorst <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Radhakrishna Sripada <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 561b31acfd65502a2cda2067513240fc57ccdbdc)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915: Preserve crtc_state->inherited during state clearing

intel_crtc_prepare_cleared_state() is unintentionally losing
the "inherited" flag. This will happen if intel_initial_commit()
is forced to go through the full modeset calculations for
whatever reason.

Afterwards the first real commit from userspace will not get
forced to the full modeset path, and thus eg. audio state may
not get recomputed properly. So if the monitor was already
enabled during boot audio will not work until userspace itself
does an explicit full modeset.

Cc: [email protected]
Tested-by: Lee Shawn C <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Uma Shankar <[email protected]>
(cherry picked from commit 2553bacaf953b48c59357f5a622282bc0c45adae)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915/mtl: Disable MC6 for MTL A step

The Wa_14017073508 require to send Media Busy/Idle mailbox while
accessing Media tile. As of now it is getting handled while __gt_unpark,
__gt_park. But there are various corner cases where forcewakes are taken
without __gt_unpark i.e. without sending Busy Mailbox especially during
register reads. Forcewakes are taken without busy mailbox leads to
GPU HANG. So bringing mailbox calls under forcewake calls are no feasible
option as forcewake calls are atomic and mailbox calls are blocking.
The issue already fixed in B step so disabling MC6 on A step and
reverting previous commit which handles Wa_14017073508

Fixes: 8f70f1ec587d ("drm/i915/mtl: Add Wa_14017073508 for SAMedia")
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 038a24835ab68f341eaa7a0e3bcc6ce0f9b22e17)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915/guc: Fix missing ecodes

Error captures are tagged with an 'ecode'. This is a pseduo-unique magic
number that is meant to distinguish similar seeming bugs with
different underlying signatures. It is a combination of two ring state
registers. Unfortunately, the register state being used is only valid
in execlist mode. In GuC mode, the register state exists in a separate
list of arbitrary register address/value pairs rather than the named
entry structure. So, search through that list to find the two exciting
registers and copy them over to the structure's named members.

v2: if else if instead of if if (Alan)

Signed-off-by: John Harrison <[email protected]>
Reviewed-by: Alan Previn <[email protected]>
Fixes: a6f0f9cf330a ("drm/i915/guc: Plumb GuC-capture into gpu_coredump")
Cc: Alan Previn <[email protected]>
Cc: Umesh Nerlige Ramappa <[email protected]>
Cc: Lucas De Marchi <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Aravind Iddamsetty <[email protected]>
Cc: Michael Cheng <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Bruce Chang <[email protected]>
Cc: Daniele Ceraolo Spurio <[email protected]>
Cc: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 9724ecdbb9ddd6da3260e4a442574b90fc75188a)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915/active: Fix missing debug object activation

debug_active_activate() expected ref->count to be zero
which is not true anymore as __i915_active_activate() calls
debug_active_activate() after incrementing the count.

v2: No need to check for "ref->count == 1" as __i915_active_activate()
already make sure of that(Janusz).

References: https://gitlab.freedesktop.org/drm/intel/-/issues/6733
Fixes: 04240e30ed06 ("drm/i915: Skip taking acquire mutex for no ref->active callback")
Cc: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: [email protected]
Cc: Janusz Krzysztofik <[email protected]>
Cc: <[email protected]> # v5.10+
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Janusz Krzysztofik <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit bfad380c542438a9b642f8190b7fd37bc77e2723)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915/gt: perform uc late init after probe error injection

Probe pseudo errors should be injected only in places where real errors
can be encountered, otherwise unwinding code can be broken.
Placing intel_uc_init_late before i915_inject_probe_error violated
this rule, resulting in following bug:
__intel_gt_disable:655 GEM_BUG_ON(intel_gt_pm_is_awake(gt))

Fixes: 481d458caede ("drm/i915/guc: Add golden context to GuC ADS")
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit c4252a11131c7f27a158294241466e2a4e7ff94e)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915: Fix format for perf_limit_reasons

Use hex format so that it is easier to decode.

Fixes: fe5979665f64 ("drm/i915/debugfs: Add perf_limit_reasons in debugfs")
Signed-off-by: Vinay Belgaumkar <[email protected]>
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 5e008ba67cb80084e99b40ccd46f9029ae421632)
Signed-off-by: Jani Nikula <[email protected]>

* drm/i915: Update vblank timestamping stuff on seamless M/N change

When we change the M/N values seamlessly during a fastset we should
also update the vblank timestamping stuff to make sure the vblank
timestamp corrections/guesstimations come out exact.

Note that only crtc_clock and framedur_ns can actually end up
changing here during fastsets. Everything else we touch can
only change during full modesets.

Technically we should try to do this exactly at the start of
vblank, but that would require some kind of double buffering
scheme. Let's skip that for now and just update things right
after the commit has been submitted to the hardware. This
means the information will be properly up to date when the
vblank irq handler goes to work. Only if someone ends up
querying some vblanky stuff in between the commit and start
of vblank may we see a slight discrepancy.

Also this same problem really exists for the DRRS downclocking
stuff. But as that is supposed to be more or less transparent
to the user, and it only drops to low gear after a long delay
(1 sec currently) we probably don't have to worry about it.
Any time something is actively submitting updates DRRS will
remain in high gear and so the timestamping constants will
match the hardware state.

Reviewed-by: Jani Nikula <[email protected]>
Reviewed-by: Mitul Golani <[email protected]>
Fixes: e6f29923c048 ("drm/i915: Allow M/N change during fastset on bdw+")
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 8cb1f95cca68421b08333175719fdd3615372ca8)
Signed-off-by: Jani Nikula <[email protected]>

* net: dsa: report rx_bytes unadjusted for ETH_HLEN

We collect the software statistics counters for RX bytes (reported to
/proc/net/dev and to ethtool -S $dev | grep 'rx_bytes: ") at a time when
skb->len has already been adjusted by the eth_type_trans() ->
skb_pull_inline(skb, ETH_HLEN) call to exclude the L2 header.

This means that when connecting 2 DSA interfaces back to back and
sending 1 packet with length 100, the sending interface will report
tx_bytes as incrementing by 100, and the receiving interface will report
rx_bytes as incrementing by 86.

Since accounting for that in scripts is quirky and is something that
would be DSA-specific behavior (requiring users to know that they are
running on a DSA interface in the first place), the proposal is that we
treat it as a bug and fix it.

This design bug has always existed in DSA, according to my analysis:
commit 91da11f870f0 ("net: Distributed Switch Architecture protocol
support") also updates skb->dev->stats.rx_bytes += skb->len after the
eth_type_trans() call. Technically, prior to Florian's commit
a86d8becc3f0 ("net: dsa: Factor bottom tag receive functions"), each and
every vendor-specific tagging protocol driver open-coded the same bug,
until the buggy code was consolidated into something resembling what can
be seen now. So each and every driver should have its own Fixes: tag,
because of their different histories until the convergence point.
I'm not going to do that, for the sake of simplicity, but just blame the
oldest appearance of buggy code.

There are 2 ways to fix the problem. One is the obvious way, and the
other is how I ended up doing it. Obvious would have been to move
dev_sw_netstats_rx_add() one line above eth_type_trans(), and below
skb_push(skb, ETH_HLEN). But DSA processing is not as simple as that.
We count the bytes after removing everything DSA-related from the
packet, to emulate what the packet's length was, on the wire, when the
user port received it.

When eth_type_trans() executes, dsa_untag_bridge_pvid() has not run yet,
so in case the switch driver requests this behavior - commit
412a1526d067 ("net: dsa: untag the bridge pvid from rx skbs") has the
details - the obvious variant of the fix wouldn't have worked, because
the positioning there would have also counted the not-yet-stripped VLAN
header length, something which is absent from the packet as seen on the
wire (there it may be untagged, whereas software will see it as
PVID-tagged).

Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

* net: qcom/emac: Fix use after free bug in emac_remove due to race condition

In emac_probe, &adpt->work_thread is bound with
emac_work_thread. Then it will be started by timeout
handler emac_tx_timeout or a IRQ handler emac_isr.

If we remove the driver which will call emac_remove
  to make cleanup, there may be a unfinished work.

The possible sequence is as follows:

Fix it by finishing the work before cleanup in the emac_remove
and disable timeout response.

CPU0                  CPU1

                    |emac_work_thread
emac_remove         |
free_netdev         |
kfree(netdev);      |
                    |emac_reinit_locked
                    |emac_mac_down
                    |//use netdev
Fixes: b9b17debc69d ("net: emac: emac gigabit ethernet controller driver")
Signed-off-by: Zheng Wang <[email protected]>

Signed-off-by: David S. Miller <[email protected]>

* net: usb: lan78xx: Limit packet length to skb->len

Packet length retrieved from descriptor may be larger than
the actual socket buffer length. In such case the cloned
skb passed up the network stack will leak kernel memory contents.

Additionally prevent integer underflow when size is less than
ETH_FCS_LEN.

Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Szymon Heidrich <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

* usb: plusb: remove unused pl_clear_QuickLink_features function

clang with W=1 reports
drivers/net/usb/plusb.c:65:1: error:
  unused function 'pl_clear_QuickLink_features' [-Werror,-Wunused-function]
pl_clear_QuickLink_features(struct usbnet *dev, int val)
^
This static function is not used, so remove it.

Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

* net/ps3_gelic_net: Fix RX sk_buff length

The Gelic Ethernet device needs to have the RX sk_buffs aligned to
GELIC_NET_RXBUF_ALIGN, and also the length of the RX sk_buffs must
be a multiple of GELIC_NET_RXBUF_ALIGN.

The current Gelic Ethernet driver was not allocating sk_buffs large
enough to allow for this alignment.

Also, correct the maximum and minimum MTU sizes, and add a new
preprocessor macro for the maximum frame size, GELIC_NET_MAX_FRAME.

Fixes various randomly occurring runtime network errors.

Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3")
Signed-off-by: Geoff Levand <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

* net/ps3_gelic_net: Use dma_mapping_error

The current Gelic Etherenet driver was checking the return value of its
dma_map_single call, and not using the dma_mapping_error() routine.

Fixes runtime problems like these:

  DMA-API: ps3_gelic_driver sb_05: device driver failed to check map error
  WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:1027 .check_unmap+0x888/0x8dc

Fixes: 02c1889166b4 ("ps3: gigabit ethernet driver for PS3, take3")
Reviewed-by: Alexander Duyck <[email protected]>
Signed-off-by: Geoff Levand <[email protected]>
Signed-off-by: David S. Miller <[email protected]>

* Merge branch 'ps3_gelic_net-fixes'

Geoff Levand says:

====================
net/ps3_gelic_net: DMA related fixes

v9: Make rx_skb_size local to gelic_descr_prepare_rx.
v8: Add more cpu_to_be32 calls.
v7: Remove all cleanups, sync to spider net.
v6: Reworked and cleaned up patches.
v5: Some additional patch cleanups.
v4: More patch cleanups.
v3: Cleaned up patches as requested.
====================

Signed-off-by: David S. Miller <[email protected]>

* Revert "drm/i915/hwmon: Enable PL1 power limit"

This reverts commit ee892ea83d99610fa33bea612de058e0955eec3a.

It was accidentally picked up for backporting. Revert.

Cc: Jani Nikula <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

* thunderbolt: Rename shadowed variables bit to interrupt_bit and auto_clear_bit

cppcheck reports
drivers/thunderbolt/nhi.c:74:7: style: Local variable 'bit' shadows outer variable [shadowVariable]
  int bit;
      ^
drivers/thunderbolt/nhi.c:66:6: note: Shadowed declaration
 int bit = ring_interrupt_index(ring) & 31;
     ^
drivers/thunderbolt/nhi.c:74:7: note: Shadow variable
  int bit;
      ^
For readablity rename the outer to interrupt_bit and the innner
to auto_clear_bit.

Fixes: 468c49f44759 ("thunderbolt: Disable interrupt auto clear for ring")
Cc: [email protected]
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>

* Merge tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:

 - Fix /proc/PID/io read_bytes accounting

 - Fix setting NLM file_lock start and end during decoding testargs

 - Fix timing for setting access cache timestamps

* tag 'nfs-for-6.3-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  NFS: Correct timing for assigning access cache timestamp
  lockd: set file_lock start and end when decoding nlm4 testargs
  NFS: Fix /proc/PID/io read_bytes for buffered reads

* ACPI: video: Add backlight=native DMI quirk for Acer Aspire 3830TG

The Acer Aspire 3830TG predates Windows 8, so it defaults to using
acpi_video# for backlight control, but this is non functional on
this model.

Add a DMI quirk to use the native backlight interface which does
work properly.

Signed-off-by: Hans de Goede <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>

* gpu: host1x: fix uninitialized variable use

The error handling for platform_get_irq() failing no longer works after
a recent change, clang now points this out with a warning:

  drivers/gpu/host1x/dev.c:520:6: error: variable 'syncpt_irq' is uninitialized when used here [-Werror,-Wuninitialized]
          if (syncpt_irq < 0)
              ^~~~~~~~~~

Fix this by removing the variable and checking the correct error status.

Fixes: 625d4ffb438c ("gpu: host1x: Rewrite syncpoint interrupt handling")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Jon Hunter <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Reviewed-by: Mikko Perttunen <[email protected]>
Reviewed-by: Nathan Chancellor <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>

* zonefs: Prevent uninitialized symbol 'size' warning

In zonefs_file_dio_append(), initialize the variable size to 0 to
prevent compilation and static code analizers warning such as:

New smatch warnings:
fs/zonefs/file.c:441 zonefs_file_dio_append() error: uninitialized
symbol 'size'.

The warning is a false positive as size is never actually used
uninitialized.

No functional change.

Reported-by: kernel test robot <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>

* zonefs: Fix error message in zonefs_file_dio_append()

Since the expected write location in a sequential file is always at the
end of the file (append write), when an invalid write append location is
detected in zonefs_file_dio_append(), print the invalid written location
instead of the expected write location.

Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations")
Cc: [email protected]
Signed-off-by: Damien Le Moal <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Reviewed-by: Johannes Thumshirn <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>

* Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux

Pull fscrypt fix from Eric Biggers:
 "Fix a bug where when a filesystem was being unmounted, the fscrypt
  keyring was destroyed before inodes have been released by the Landlock
  LSM.

  This bug was found by syzbot"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
  fscrypt: check for NULL keyring in fscrypt_put_master_key_activeref()
  fscrypt: improve fscrypt_destroy_keyring() documentation
  fscrypt: destroy keyring after security_sb_delete()

* Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux

Pull fsverity fixes from Eric Biggers:
 "Fix two significant performance issues with fsverity"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
  fsverity: Remove WQ_UNBOUND from fsverity read workqueue

* block/io_uring: pass in issue_flags for uring_cmd task_work handling

io_uring_cmd_done() currently assumes that the uring_lock is held
when invoked, and while it generally is, this is not guaranteed.
Pass in the issue_flags associated with it, so that we have
IO_URING_F_UNLOCKED available to be able to lock the CQ ring
appropriately when completing events.

Cc: [email protected]
Fixes: ee692a21e9bf ("fs,io_uring: add infrastructure for uring-cmd")
Signed-off-by: Jens Axboe <[email protected]>

* io_uring/net: avoid sending -ECONNABORTED on repeated connection requests

Since io_uring does nonblocking connect requests, if we do two repeated
ones without having a listener, the second will get -ECONNABORTED rather
than the expected -ECONNREFUSED. Treat -ECONNABORTED like a normal retry
condition if we're nonblocking, if we haven't already seen it.

Cc: [email protected]
Fixes: 3fb1bd688172 ("io_uring/net: handle -EINPROGRESS correct for IORING_OP_CONNECT")
Link: https://github.com/axboe/liburing/issues/828
Reported-by: Hui, Chunyang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>

* octeontx2-vf: Add missing free for alloc_percpu

Add the free_percpu for the allocated "vf->hw.lmt_info" in order to avoid
memory leak, same as the "pf->hw.lmt_info" in
`drivers/net/ethernet/marvell/octeontx2/nic/otx2_pf.c`.

Fixes: 5c0512072f65 ("octeontx2-pf: cn10k: Use runtime allocated LMTLINE region")
Signed-off-by: Jiasheng Jiang <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Acked-by: Geethasowjanya Akula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>

* drm: panel-orientation-quirks: Add quirk for Lenovo Yoga Book X90F

Like the Windows Lenovo Yoga Book X91F/L the Android Lenovo Yoga Book
X90F/L has a portrait 1200x1920 screen used in landscape mode,
add a quirk for this.

When the quirk for the X91F/L was initially added it was written to
also apply to the X90F/L but this does not work because the Android
version of the Yoga Book uses completely different DMI strings.
Also adjust the X91F/L quirk to reflect that it only applies to
the X91F/L models.

Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Javier Martinez Canillas <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]

* entry: Fix noinstr warning in __enter_from_user_mode()

__enter_from_user_mode() is triggering noinstr warnings with
CONFIG_DEBUG_PREEMPT due to its call of preempt_count_add() via
ct_state().

The preemption disable isn't needed as interrupts are already disabled.
And the context_tracking_enabled() check in ct_state() also isn't needed
as that's already being done by the CT_WARN_ON().

Just use __ct_state() instead.

Fixes the following warnings:

  vmlinux.o: warning: objtool: enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: syscall_enter_from_user_mode+0xf9: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0xc7: call to preempt_count_add() leaves .noinstr.text section
  vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0xba: call to preempt_count_add() leaves .noinstr.text section

Fixes: 171476775d32 ("context_tracking: Convert state to atomic_t")
Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/d8955fa6d68dc955dda19baf13ae014ae27926f5.1677369694.git.jpoimboe@kernel.org

* sched/fair: Sanitize vruntime of entity being migrated

Commit 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
fixes an overflowing bug, but ignore a case that se->exec_start is reset
after a migration.

For fixing this case, we delay the reset of se->exec_start after
placing the entity which se->exec_start to detect long sleeping task.

In order to take into account a possible divergence between the clock_task
of 2 rqs, we increase the threshold to around 104 days.

Fixes: 829c1651e9c4 ("sched/fair: sanitize vruntime of entity being placed")
Originally-by: Zhang Qiao <[email protected]>
Signed-off-by: Vincent Guittot <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Zhang Qiao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

* perf/x86/amd/core: Always clear status for idx

The variable 'status' (which contains the unhandled overflow bits) is
not being properly masked in some cases, displaying the following
warning:

  WARNING: CPU: 156 PID: 475601 at arch/x86/events/amd/core.c:972 amd_pmu_v2_handle_irq+0x216/0x270

This seems to be happening because the loop is being continued before
the status bit being unset, in case x86_perf_event_set_period()
returns 0. This is also causing an inconsistency because the "handled"
counter is incremented, but the status bit is not cleaned.

Move the bit cleaning together above, together when the "handled"
counter is incremented.

Fixes: 7685665c390d ("perf/x86/amd/core: Add PerfMonV2 overflow handling")
Signed-off-by: Breno Leitao <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Sandipan Das <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

* entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up

RCU sometimes needs to perform a delayed wake up for specific kthreads
handling offloaded callbacks (RCU_NOCB).  These wakeups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).

However the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched towards the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_RESCHED set and the pending wake up ignored.

Fix this with fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.

Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

* efi/libstub: zboot: Add compressed image to make targets

Avoid needlessly rebuilding the compressed image by adding the file
'vmlinuz' to the 'targets' Kbuild make variable.

Signed-off-by: Ard Biesheuvel <[email protected]>

* hwmon: (peci/cputemp) Fix miscalculated DTS for SKX

For Skylake, DTS temperature of the CPU is reported in S10.6 format
instead of S8.8.

Reported-by: Paul Fertser <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Iwona Winiarska <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>

* hwmon: fix potential sensor registration fail if of_node is missing

It is not sufficient to check of_node in current device.
In some cases, this would cause the sensor registration to fail.

This patch looks for device's ancestors to find a valid of_node if any.

Fixes: d560168b5d0f ("hwmon: (core) New hwmon registration API")
Signed-off-by: Phinex Hung <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>

* hwmon: (xgene) Fix ioremap and memremap leak

Smatch reports:

drivers/hwmon/xgene-hwmon.c:757 xgene_hwmon_probe() warn:
'ctx->pcc_comm_addr' from ioremap() not released on line: 757.

This is because in drivers/hwmon/xgene-hwmon.c:701 xgene_hwmon_probe(),
ioremap and memremap is not released, which may cause a leak.

To fix this, ioremap and memremap is modified to devm_ioremap and
devm_memremap.

Signed-off-by: Tianyi Jing <[email protected]>
Reviewed-by: Dongliang Mu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[groeck: Fixed formatting and subject]
Signed-off-by: Guenter Roeck <[email protected]>

* bootconfig: Fix testcase to increase max node

Since commit 6c40624930c5 ("bootconfig: Increase max nodes of bootconfig
from 1024 to 8192 for DCC support") increased the max number of bootconfig
node to 8192, the bootconfig testcase of the max number of nodes fails.
To fix this issue, we can not simply increase the number in the test script
because the test bootconfig file becomes too big (>32KB). To fix that, we
can use a combination of three alphabets (26^3 = 17576). But with that,
we can not express the 8193 (just one exceed from the limitation) because
it also exceeds the max size of bootconfig. So, the first 26 nodes will just
use one alphabet.

With this fix, test-bootconfig.sh passes all tests.

Link: https://lore.kernel.org/all/167888844790.791176.670805252426835131.stgit@devnote2/

Reported-by: Heinz Wiesinger <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
Fixes: 6c40624930c5 ("bootconfig: Increase max nodes of bootconfig from 1024 to 8192 for DCC support")
Signed-off-by: Masami Hiramatsu (Google) <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>

* keys: Do not cache key in task struct if key is requested from kernel thread

The key which gets cached in task structure from a kernel thread does not
get invalidated even after expiry.  Due to which, a new key request from
kernel thread will be served with the cached key if it's present in task
struct irrespective of the key validity.  The change is to not cache key in
task_struct when key requested from kernel thread so that kernel thread
gets a valid key on every key request.

The problem has been seen with the cifs module doing DNS lookups from a
kernel thread and the results getting pinned by being attached to that
kernel thread's cache - and thus not something that can be easily got rid
of.  The cache would ordinarily be cleared by notify-resume, but kernel
threads don't do that.

This isn't seen with AFS because AFS is doing request_key() within the
kernel half of a user thread - which will do notify-resume.

Fixes: 7743c48e54ee ("keys: Cache result of request_key*() temporarily in task_struct")
Signed-off-by: Bharath SM <[email protected]>
Signed-off-by: David Howells <[email protected]>
Reviewed-by: Jarkko Sakkinen <[email protected]>
cc: Shyam Prasad N <[email protected]>
cc: Steve French <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
Link: https://lore.kernel.org/r/CAGypqWw951d=zYRbdgNR4snUDvJhWL=q3=WOyh7HhSJupjz2vA@mail.gmail.com/

* verify_pefile: relax wrapper length check

The PE Format Specification (section "The Attribute Certificate Table
(Image Only)") states that `dwLength` is to be rounded up to 8-byte
alignment when used for traversal.  Therefore, the field is not required
to be an 8-byte multiple in the first place.

Accordingly, pesign has not performed this alignment since version
0.110.  This causes kexec failure on pesign'd binaries with "PEFILE:
Signature wrapper len wrong".  Update the comment and relax the check.

Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Jarkko Sakkinen <[email protected]>
cc: Eric Biederman <[email protected]>
cc: Herbert Xu <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
Link: https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#the-attribute-certificate-table-image-only
Link: https://github.com/rhboot/pesign
Link: https://lore.kernel.org/r/[email protected]/ # v2

* asymmetric_keys: log on fatal failures in PE/pkcs7

These particular errors can be encountered while trying to kexec when
secureboot lockdown is in place.  Without this change, even with a
signed debug build, one still needs to reboot the machine to add the
appropriate dyndbg parameters (since lockdown blocks debugfs).

Accordingly, upgrade all pr_debug() before fatal error into pr_warn().

Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Jarkko Sakkinen <[email protected]>
cc: Eric Biederman <[email protected]>
cc: Herbert Xu <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]/ # v2

* ice: fix rx buffers handling for flow director packets

Adding flow director filters stopped working correctly after
commit 2fba7dc5157b ("ice: Add support for XDP multi-buffer
on Rx side"). As a result, only first flow director filter
can be added, adding next filter leads to NULL pointer
dereference attached below.

Rx buffer handling and reallocation logic has been optimized,
however flow director specific traffic was not accounted for.
As a result driver handled those packets incorrectly since new
logic was based on ice_rx_ring::first_desc which was not set
in this case.

Fix this by setting struct ice_rx_ring::first_desc to next_to_clean
for flow director received packets.

[  438.544867] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  438.551840] #PF: supervisor read access in kernel mode
[  438.556978] #PF: error_code(0x0000) - not-present page
[  438.562115] PGD 7c953b2067 P4D 0
[  438.565436] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  438.569794] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 6.2.0-net-bug #1
[  438.577531] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[  438.588470] RIP: 0010:ice_clean_rx_irq+0x2b9/0xf20 [ice]
[  438.593860] Code: 45 89 f7 e9 ac 00 00 00 8b 4d 78 41 31 4e 10 41 09 d5 4d 85 f6 0f 84 82 00 00 00 49 8b 4e 08 41 8b 76
1c 65 8b 3d 47 36 4a 3f <48> 8b 11 48 c1 ea 36 39 d7 0f 85 a6 00 00 00 f6 41 08 02 0f 85 9c
[  438.612605] RSP: 0018:ff8c732640003ec8 EFLAGS: 00010082
[  438.617831] RAX: 0000000000000800 RBX: 00000000000007ff RCX: 0000000000000000
[  438.624957] RDX: 0000000000000800 RSI: 0000000000000000 RDI: 0000000000000000
[  438.632089] RBP: ff4ed275a2158200 R08: 00000000ffffffff R09: 0000000000000020
[  438.639222] R10: 0000000000000000 R11: 0000000000000020 R12: 0000000000001000
[  438.646356] R13: 0000000000000000 R14: ff4ed275d0daffe0 R15: 0000000000000000
[  438.653485] FS:  0000000000000000(0000) GS:ff4ed2738fa00000(0000) knlGS:0000000000000000
[  438.661563] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  438.667310] CR2: 0000000000000000 CR3: 0000007c9f0d6006 CR4: 0000000000771ef0
[  438.674444] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  438.681573] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  438.688697] PKRU: 55555554
[  438.691404] Call Trace:
[  438.693857]  <IRQ>
[  438.695877]  ? profile_tick+0x17/0x80
[  438.699542]  ice_msix_clean_ctrl_vsi+0x24/0x50 [ice]
[  438.702571] ice 0000:b1:00.0: VF 1: ctrl_vsi irq timeout
[  438.704542]  __handle_irq_event_percpu+0x43/0x1a0
[  438.704549]  handle_irq_event+0x34/0x70
[  438.704554]  handle_edge_irq+0x9f/0x240
[  438.709901] iavf 0000:b1:01.1: Failed to add Flow Director filter with status: 6
[  438.714571]  __common_interrupt+0x63/0x100
[  438.714580]  common_interrupt+0xb4/0xd0
[  438.718424] iavf 0000:b1:01.1: Rule ID: 127 dst_ip: 0.0.0.0 src_ip 0.0.0.0 UDP: dst_port 4 src_port 0
[  438.722255]  </IRQ>
[  438.722257]  <TASK>
[  438.722257]  asm_common_interrupt+0x22/0x40
[  438.722262] RIP: 0010:cpuidle_enter_state+0xc8/0x430
[  438.722267] Code: 6e e9 25 ff e8 f9 ef ff ff 8b 53 04 49 89 c5 0f 1f 44 00 00 31 ff e8 d7 f1 24 ff 45
84 ff 0f 85 57 02 00 00 fb 0f 1f 44 00 00 <45> 85 f6 0f 88 85 01 00 00 49 63 d6 48 8d 04 52 48 8d 04 82 49 8d
[  438.722269] RSP: 0018:ffffffff86003e50 EFLAGS: 00000246
[  438.784108] RAX: ff4ed2738fa00000 RBX: ffbe72a64fc01020 RCX: 0000000000000000
[  438.791234] RDX: 0000000000000000 RSI: ffffffff858d84de RDI: ffffffff85893641
[  438.798365] RBP: 0000000000000002 R08: 0000000000000002 R09: 000000003158af9d
[  438.805490] R10: 0000000000000008 R11: 0000000000000354 R12: ffffffff862365a0
[  438.812622] R13: 000000661b472a87 R14: 0000000000000002 R15: 0000000000000000
[  438.819757]  cpuidle_enter+0x29/0x40
[  438.823333]  do_idle+0x1b6/0x230
[  438.826566]  cpu_startup_entry+0x19/0x20
[  438.830492]  rest_init+0xcb/0xd0
[  438.833717]  arch_call_rest_init+0xa/0x30
[  438.837731]  start_kernel+0x776/0xb70
[  438.841396]  secondary_startup_64_no_verify+0xe5/0xeb
[  438.846449]  </TASK>

Fixes: 2fba7dc5157b ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Piotr Raczynski <[email protected]>
Acked-by: Maciej Fijalkowski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

* ice: check if VF exists before mode check

Setting trust on VF should return EINVAL when there is no VF. Move
checking for switchdev mode after checking if VF exists.

Fixes: c54d209c78b8 ("ice: Wait for VF to be reset/ready before configuration")
Signed-off-by: Michal Swiatkowski <[email protected]>
Signed-off-by: Kalyan Kodamagula <[email protected]>
Tested-by: Sujai Buvaneswaran <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

* ice: remove filters only if VSI is deleted

Filters shouldn't be removed in VSI rebuild path. Removing them on PF
VSI results in no rule for PF MAC after changing for example queues
amount.

Remove all filters only in the VSI remove flow. As unload should also
cause the filter to be removed introduce, a new function ice_stop_eth().
It will unroll ice_start_eth(), so remove filters and close VSI.

Fixes: 6624e780a577 ("ice: split ice_vsi_setup into smaller functions")
Signed-off-by: Michal Swiatkowski <[email protected]>
Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>

* iavf: fix hang on reboot with ice

When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 974578017fc1 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a8417330f8a5 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <[email protected]>
Signed-off-by: Stefan Assmann <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>

* i40e: fix flow director packet filter programming

Initialize to zero structures to build a valid
Tx Packet used for the filter programming.

Fixes: a9219b332f52 ("i40e: VLAN field for flow director")
Signed-off-by: Radoslaw Tyl <[email protected]>
Reviewed-by: Michal Swiatkowski <[email protected]>
Tested-by: Arpana Arland <[email protected]> (A Contingent worker at Intel)
Sign…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.