When new vcpu file descriptors are created and bound to the new kvm file
descriptor as a part of the confidential guest reset mechanism, various
subsystems needs to know about it. This change adds notifiers so that various
subsystems can take appropriate actions when vcpu fds change by registering
their handlers to this notifier.
Subsequent changes will register specific handlers to this notifier.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-31-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Confidential guests needs to generate a new KVM file descriptor upon virtual
machine reset. Existing VCPUs needs to be reattached to this new
KVM VM file descriptor. As a part of this, new VCPU file descriptors against
this new KVM VM file descriptor needs to be created and re-initialized.
Resources allocated against the old VCPU fds needs to be released. This change
makes this happen.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-16-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Various subsystems might need to take some steps before the KVM file descriptor
for a virtual machine is changed. So a new boolean attribute is added to the
vmfd_notifier structure which is passed to the notifier callbacks.
vmfd_notifer.pre is true for pre-notification of vmfd change and false for
post notification. Notifier callback implementations can simply check
the boolean value for (vmfd_notifer*)->pre and can take actions for pre or
post vmfd change based on the value.
Subsequent patches will add callback implementations for specific components
that need this pre-notification.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-9-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A notifier callback can be used by various subsystems to perform actions when
KVM file descriptor for a virtual machine changes as a part of confidential
guest reset process. This change adds this notifier mechanism. Subsequent
patches will add specific implementations for various notifier callbacks
corresponding to various subsystems that need to take action when KVM VM file
descriptor changed.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-8-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This change adds common kvm specific support to handle KVM VM file descriptor
change. KVM VM file descriptor can change as a part of confidential guest reset
mechanism. A new function api kvm_arch_on_vmfd_change() per
architecture platform is added in order to implement architecture specific
changes required to support it. A subsequent patch will add x86 specific
implementation for kvm_arch_on_vmfd_change() as currently only x86 supports
confidential guest reset.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Link: https://lore.kernel.org/r/20260225035000.385950-6-anisinha@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Nitro Enclaves are a confidential compute technology which
allows a parent instance to carve out resources from itself
and spawn a confidential sibling VM next to itself. Similar
to other confidential compute solutions, this sibling is
controlled by an underlying vmm, but still has a higher level
vmm (QEMU) to implement some of its I/O functionality and
lifecycle.
Add an accelerator to drive this interface. In combination with
follow-on patches to enhance the Nitro Enclaves machine model, this
will allow users to run a Nitro Enclave using QEMU.
Signed-off-by: Alexander Graf <graf@amazon.com>
Link: https://lore.kernel.org/r/20260225220807.33092-5-graf@amazon.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This change removes userland code that worked around a restriction
in the mshv driver in the 6.18 kernel: regions from userland
couldn't be mapped to multiple regions in the kernel. We maintained a
shadow mapping table in qemu and used a heuristic to swap in a requested
region in case of UNMAPPED_GPA exits.
However, this heuristic wasn't reliable in all cases, since HyperV
behaviour is not 100% reliable across versions. HyperV itself doesn't
prohibit to map regions at multiple places into the guest, so the
restriction has been removed in the mshv driver.
Hence we can remove the remapping code. Effectively this will mandate a
6.19 kernel, if the workload attempt to map e.g. BIOS to multiple
reagions. I still think it's the right call to remove this logic:
- The workaround only seems to work reliably with a certain revision
of HyperV as a nested hypervisor.
- We expect Direct Virtualization (L1VH) to be the main platform for
the mshv accelerator, which also requires a 6.19 kernel
This reverts commit efc4093358.
Signed-off-by: Magnus Kulke <magnuskulke@linux.microsoft.com>
Acked-by: Wei Liu (Microsoft) <wei.liu@kernel.org>
Tested-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Link: https://lore.kernel.org/r/20260113153708.448968-1-magnuskulke@linux.microsoft.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Change terminology to match the KVM one, as APIC is x86-specific.
And move out whpx_irqchip_in_kernel() to make it usable from common
code even when not compiling with WHPX support.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As of why: WHPX on arm64 doesn't have debug trap support as of today.
Keep the exception bitmap interface for now - despite that being entirely unavailable on arm64 too.
Signed-off-by: Mohamed Mediouni <mohamed@unpredictable.fr>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This uninitialized value violates the contract in the
documentation comment, and may lead to a SEGV during
translaton with -d in_asm.
Change the documentation to disallow hostp NULL.
Pass hostp to probe_access_internal directly.
Reported-by: Panda Jiang <3160104094@zju.edu.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pass a dummy variable instead to let the value be discarded,
in preparation for making the argument mandatory.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
'CPUTLBEntryFull.xlat_section' stores section_index in last 12 bits to
find the correct section when CPU access the IO region over the IOTLB.
However, section_index is only unique inside single AddressSpace. If
address space translation is over IOMMUMemoryRegion, it could return
section from other AddressSpace. 'iotlb_to_section()' API only finds the
sections from CPU's AddressSpace so that it couldn't find section in
other AddressSpace. Thus, using 'iotlb_to_section()' API will find the
wrong section and QEMU will have wrong load/store access.
To fix this bug of iotlb_to_section(), store complete MemoryRegionSection
pointer in CPUTLBEntryFull to replace the section_index in xlat_section.
Rename 'xlat_section' to 'xlat' as we remove last 12 bits section_index
inside. Also, since we directly use section pointer in the
CPUTLBEntryFull (full->section), we can remove the unused functions:
iotlb_to_section(), memory_region_section_get_iotlb().
This bug occurs only when
(1) IOMMUMemoryRegion is in the path of CPU access.
(2) IOMMUMemoryRegion returns different target_as and the section is in
the IO region.
Common IOMMU devices don't have this issue since they are only in the
path of DMA access. Currently, the bug only occurs when ARM MPC device
(hw/misc/tz-mpc.c) returns 'blocked_io_as' to emulate blocked access
handling. Upcoming RISC-V wgChecker [1] and IOPMP [2] devices are also
affected by this bug.
[1] RISC-V WG:
https://patchew.org/QEMU/20251021155548.584543-1-jim.shu@sifive.com/
[2] RISC-V IOPMP:
https://patchew.org/QEMU/20250312093735.1517740-1-ethan84@andestech.com/
Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mark Burton <mburton@qti.qualcomm.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260128152348.2095427-3-jim.shu@sifive.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
This commit was created with scripts/clean-includes:
./scripts/clean-includes '--git' 'mshv' 'accel/mshv' 'target/i386/mshv' 'include/system/mshv.h'
All .c should include qemu/osdep.h first. The script performs three
related cleanups:
* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c already includes
it. Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
Drop these, too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20260116125830.926296-2-peter.maydell@linaro.org
CONFIG_ATOMIC64 is a configuration knob for 32-bit hosts.
This allows removal of functions like load_atomic8_or_exit
and simplification of load_atom_extract_al8_or_exit to
load_atom_extract_al8.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Remove instances of __i386__, except from tests and imported headers.
Drop a block containing sanity check and fprintf error message for
i386-on-i386 or x86_64-on-x86_64 emulation. If we really want
something like this, we would do it via some form of compile-time check.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Currently we can not build files including "exec/watchpoint.h"
as meson common objects because the CONFIG_USER_ONLY definition
is poisoned. We can easily fix that by un-inlining the
user-emulation stubs.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260106231908.16756-5-philmd@linaro.org>