'CPUTLBEntryFull.xlat_section' stores section_index in last 12 bits to
find the correct section when CPU access the IO region over the IOTLB.
However, section_index is only unique inside single AddressSpace. If
address space translation is over IOMMUMemoryRegion, it could return
section from other AddressSpace. 'iotlb_to_section()' API only finds the
sections from CPU's AddressSpace so that it couldn't find section in
other AddressSpace. Thus, using 'iotlb_to_section()' API will find the
wrong section and QEMU will have wrong load/store access.
To fix this bug of iotlb_to_section(), store complete MemoryRegionSection
pointer in CPUTLBEntryFull to replace the section_index in xlat_section.
Rename 'xlat_section' to 'xlat' as we remove last 12 bits section_index
inside. Also, since we directly use section pointer in the
CPUTLBEntryFull (full->section), we can remove the unused functions:
iotlb_to_section(), memory_region_section_get_iotlb().
This bug occurs only when
(1) IOMMUMemoryRegion is in the path of CPU access.
(2) IOMMUMemoryRegion returns different target_as and the section is in
the IO region.
Common IOMMU devices don't have this issue since they are only in the
path of DMA access. Currently, the bug only occurs when ARM MPC device
(hw/misc/tz-mpc.c) returns 'blocked_io_as' to emulate blocked access
handling. Upcoming RISC-V wgChecker [1] and IOPMP [2] devices are also
affected by this bug.
[1] RISC-V WG:
https://patchew.org/QEMU/20251021155548.584543-1-jim.shu@sifive.com/
[2] RISC-V IOPMP:
https://patchew.org/QEMU/20250312093735.1517740-1-ethan84@andestech.com/
Signed-off-by: Jim Shu <jim.shu@sifive.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Mark Burton <mburton@qti.qualcomm.com>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-ID: <20260128152348.2095427-3-jim.shu@sifive.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Guard the native endian definition we want to remove by surrounding
it with TARGET_NOT_USING_LEGACY_NATIVE_ENDIAN_API #ifdef'ry.
Assign values to the enumerators so they stay unchanged.
Once a target gets cleaned we'll set the definition in the target
config, then the target won't be able to use the legacy API anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260109165058.59144-21-philmd@linaro.org>
Guard the native endian APIs we want to remove by surrounding
them with TARGET_NOT_USING_LEGACY_NATIVE_ENDIAN_API #ifdef'ry.
Once a target gets cleaned we'll set the definition in the
target config, then the target won't be able to use the legacy
API anymore.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260109165058.59144-16-philmd@linaro.org>
All the LD/ST[W,L,Q] variants use the same template, only
modifying the access size used. Unify as a single pair of
LD/ST methods taking a MemOp argument. Thus use the 'm'
suffix for MemOp.
Keep the pre-existing "warning: addr must be aligned" comment.
We leave the wonder about why we aren't asserting alignment
for later.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260109165058.59144-11-philmd@linaro.org>
We have 115 direct inclusions of "system/memory.h", and 91 headers
in include/ use it: hundreds of files have to process it.
However only one single header really uses the MemoryRegionCache
API: "hw/virtio/virtio-access.h". Split it out to a new header,
avoiding processing unused inlined functions hundreds of times.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-ID: <20260109165058.59144-6-philmd@linaro.org>
Error reporting patches for 2026-01-07
# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCgAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmlfU/wSHGFybWJydUBy
# ZWRoYXQuY29tAAoJEDhwtADrkYZToiUP/RW1I1wFyescGOpUxBjlXqbkgZvrdRbL
# BxtTcoCW0q/cc1Fv3CSYZMm+vvwWJyAysnYONDu6ldDl9ojKfGT/gi1Tgp/99/r4
# bgLbAvExRbbyPOkPBtoXCYeobmgaDP9pMHHlVcdFQrW9hmQiEl4QSWPImmNrKEk2
# gV9SZJ737k9n5dq4XLbqlHXKspn4lWiUE9hbHIUrKWZDn0LDdr5z2wkjhZCmuCR2
# mRSgJhc68Lnb1LdBdRo/5PlG6Hw3jvLat4+q+42teN/aI6zJbD9yKocgaGtubVv1
# h4ntJPMvDKD7DRZF06k8crpLMXJjZFztVr30XBE/e7wG+xY34+3tho3iCQN1vTFe
# RBJne0FaRPGSNYpF8Tj7lPIr0kduqk3/lOQ9HPobTroIPTrCcRXbdOeQ/Ed/Cjrk
# suh8t4OGmy0ThcsUsAajSjPDw2aFlitCS4pWNaSctTvR7V+2trol+WS2QO4My0MX
# 4Z3BnOHBnhE/xo+22T4FW3NvNcFKsQ5Tlq6mjjAgFJ/guaJ2TbMFe/Pm9TtzcPHj
# 7mhBBvKStNWFrQz66z7+hxJhOuOEmON8i4coADDPTUWmcICCyjtJW5m5f+PoYYHr
# LpFwIFHWuKtSAwWQKReOAmA2p0gx1FNZX9eGCl/4IQ54/tLP2zJ07t6LwAl6fn6t
# mKXChdbC9L7p
# =CY68
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 08 Jan 2026 05:51:40 PM AEDT
# gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg: issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [unknown]
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* tag 'pull-error-2026-01-07-v2' of https://repo.or.cz/qemu/armbru:
block/file-win32: Improve an error message
qga/commands-win32: Use error_setg_win32() for better error messages
error: Use error_setg_errno() for simplicity and consistency
error: Use error_setg_errno() to improve error messages
net/slirp: Improve file open error message
error: Use error_setg_file_open() for simplicity and consistency
blkdebug: Use error_setg_file_open() for a better error message
net/tap: Use error_setg_file_open() for a better error message
qga: Use error_setg_file_open() for better error messages
tap-solaris: Use error_setg_file_open() for better error messages
ui: Convert to qemu_create() for simplicity and consistency
error: Strip trailing '\n' from error string arguments (again)
error: Consistently name Error * objects err, and not errp
error: error_free(NULL) is safe, drop unnecessary conditionals
nbd/client-connection: Replace error_propagate() by assignment
hw/nvram/xlnx-bbram: More idiomatic and simpler error reporting
hw/core/loader: Make load_elf_hdr() return bool, simplify caller
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use error_setg_errno() instead of passing the value of strerror() or
g_strerror() to error_setg().
The separator between the error message proper and the value of
strerror() changes from " : ", "", " - ", "- " to ": " in places.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20251121121438.1249498-14-armbru@redhat.com>
Acked-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
qdev_print_props() retrieves a property's value from its legacy
property if it exists. A legacy property is created by
qdev_class_add_legacy_property() when the property has a print()
method or does not have a get() method.
If it has a print() method, the legacy property's value is obtained
from the property's print() method. This is used to format PCI
addresses nicely, i.e. like 01.3 instead of 11.
Else, if doesn't have a get() method, the legacy property is
unreadable. "info qtree" silently skips unreadable properties.
Link properties don't have a get() method, and are therefore skipped.
This is wrong, because the underlying QOM property *is* readable.
Change qdev_print_props() to simply use a print() method directly if
it exists, else get the value via QOM.
"info qtree" now shows links fine. For instance, machine "pc" onboard
device "PIIX4_PM" property "bus" is now visible.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20251022101420.36059-3-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
stl_phys_notdirty() was added in commit 8df1cd076c ("physical memory
access functions") as a (premature?) optimisation for the CODE path.
Meanwhile 20 years passed, we might never have understood / used it
properly; the code evolved and now the recommended way to access the
CODE path is via the cpu_ld/st_mmu*() API.
Remove both address_space_stl_notdirty() and stl_phys_notdirty()
leftovers.
Suggested-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20251224151351.86733-5-philmd@linaro.org>
* cleanup include/hw headers
* cleanup memory headers
* rust: preludes
* rust: support for dtrace
* rust/hpet: first part of reorganization
* meson: small cleanups
* target/i386: Diamond Rapids CPU model including CET, APX, AVX10.2
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCgAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmlPov8UHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroPN1wf9HCceQ1273g7HbNeamay2bSaqypyM
# sEUBk4ipwO0dp7AYaaX5MeJ8NxeYcK82oFgm35WLY1tMOv0BZG5ez02dLoh5C4fb
# Bmy3kV1aY9cxF0IwTyD4dIADlZoaMnGgMElUKFY2/EixjxOUMLe90b1MO2KczqFa
# jvC4gmjx5PC1r+BHycSEdKm2Rbunueb/5eSkKeyTX7rjxQ/Eij0uGjrWrZkMWtgs
# ERJ2xo+D6a38w/uJ88KuqUV1BqYxNNwKmvOwVBU2xFB9o9bm20TNOJZ3+D+Ki8Aj
# idv+rU0XY1bWseo4USuozsqxfkjLJ5lj2YYUkSVO/I1wJmuO7Bq6xzrCxg==
# =/nIt
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 27 Dec 2025 08:12:31 PM AEDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [unknown]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (152 commits)
block: rename block/aio-wait.h to qemu/aio-wait.h
block: rename block/aio.h to qemu/aio.h
block: reduce files included by block/aio.h
block: extract include/qemu/aiocb.h out of include/block/aio.h
hw: add missing includes hidden by block/aio.h
qmp: Fix thread race
thread-pool: Fix thread race
dosc/cpu-models-x86: Add documentation for DiamondRapids
i386/cpu: Add CPU model for Diamond Rapids
i386/cpu: Define dependency for VMX_VM_ENTRY_LOAD_IA32_FRED
i386/cpu: Add an option in X86CPUDefinition to control CPUID 0x1f
i386/cpu: Allow cache to be shared at thread level
i386/cpu: Allow unsupported avx10_version with x-force-features
i386/cpu: Add a helper to get host avx10 version
i386/cpu: Support AVX10.2 with AVX10 feature models
i386/cpu: Add support for AVX10_VNNI_INT in CPUID enumeration
i386/cpu: Add CPUID.0x1E.0x1 subleaf for AMX instructions
i386/cpu: Add support for MOVRS in CPUID enumeration
run: introduce a script for running devel commands
gitlab-ci: enable rust for msys2-64bit
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Move RAMBlock functions out of ram_addr.h and cpu-common.h;
move memory API headers out of include/exec and into include/system.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In our long-term experience in Bytedance, we've found that under
the same load, live migration of larger VMs with more devices is
often more difficult to converge (requiring a larger downtime limit).
Through some testing and calculations, we conclude that bitmap sync time
affects the calculation of live migration bandwidth.
When the addresses processed are not aligned, a large number of
clear_dirty ioctl occur (e.g. a 4MB misaligned memory can generate
2048 clear_dirty ioctls from two different memory_listener),
which increases the time required for bitmap_sync and makes it
more difficult for dirty pages to converge.
For a 64C256G vm with 8 vhost-user-net(32 queue per nic) and
16 vhost-user-blk(4 queue per blk), the sync time is as high as *73ms*
(tested with 10GBps dirty rate, the sync time increases as the dirty
page rate increases), Here are each part of the sync time:
- sync from kvm to ram_list: 2.5ms
- vhost_log_sync:3ms
- sync aligned memory from ram_list to RAMBlock: 5ms
- sync misaligned memory from ram_list to RAMBlock: 61ms
Attempt to merge those fragmented clear_dirty ioctls, then syncing
misaligned memory from ram_list to RAMBlock takes only about 1ms,
and the total sync time is only *12ms*.
Signed-off-by: Chuang Xu <xuchuangxclwt@bytedance.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20251218114220.83354-1-xuchuangxclwt@bytedance.com
[peterx: drop var "offset" in physical_memory_sync_dirty_bitmap]
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently the code that reads the qtest protocol commands insists
that every input line has a command. If it receives a line with
nothing but whitespace it will trip an assertion in
qtest_process_command().
This is a little awkward for the case where we are feeding qtest a
set of bug-reproduction commands via standard input or a file,
because it means you need to be careful not to leave a blank line at
the start or the end when cutting and pasting the command sequence
from a bug report.
Change the code to allow and ignore blank lines in the input.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20251106151959.1088095-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
In the qtest_event() QEMUChrEvent handler, we create a timer
and log OPENED on CHR_EVENT_OPENED, and we destroy the timer and
log CLOSED on CHR_EVENT_CLOSED. However, the chardev subsystem
can send us more than one CHR_EVENT_CLOSED if we're reading from
a file chardev:
* the first one happens when we read the last data from the file
* the second one happens when the user hits ^C to exit QEMU
and the chardev is finalized: char_fd_finalize()
This causes us to call g_timer_elapsed() with a NULL timer
(which glib complains about) and print an extra CLOSED log line
with a zero timestamp:
[I +0.063829] CLOSED
qemu-system-aarch64: GLib: g_timer_elapsed: assertion 'timer != NULL' failed
[I +0.000000] CLOSED
Avoid this by ignoring a CHR_EVENT_CLOSED if we have already
processed one.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Message-ID: <20251107174306.1408139-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
mem + migration pull for 10.2
- Fabiano's patch to fix snapshot crash by rejecting some caps
- Marco's mapped-ram support on snapshot save/load
- Steve's cpr maintainers entry update on retirement
- Peter's coverity fixes
- Chenyi's tdx fix on hugetlbfs regression
- Peter's doc update on migrate resume flag
- Peter's doc update on HMP set parameter for cpr-exec-command's char** parsing
- Xiaoyao's guest-memfd fix for enabling shmem
- Arun's fix on error_fatal regression for migration errors
- Bin's fix on redundant error free for add block failures
- Markus's cleanup around MigMode sets
- Peter's two patches (out of loadvm threadify) to cleanup qio read peek process
- Thomas's vmstate-static-checker update for possible deprecation of argparse use
- Stefan's fix on windows deadlock by making unassigned MMIOs lockless
# -----BEGIN PGP SIGNATURE-----
#
# iIgEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCaQkZPBIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wZhTgEA8eCBMpM7PusNSdzzeIygKnIp2A8I70ca
# eIJz3ZM+FiUBAPVDrIZ59EhZA6NPcJb8Ya9OY4lT63F4BxrvN+f+uG4N
# =GUBi
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Nov 2025 10:06:04 PM CET
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [unknown]
# gpg: aka "Peter Xu <peterx@redhat.com>" [unknown]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'staging-pull-request' of https://gitlab.com/peterx/qemu: (36 commits)
migration: Introduce POSTCOPY_DEVICE state
migration: Make postcopy listen thread joinable
migration: Respect exit-on-error when migration fails before resuming
migration: Refactor all incoming cleanup info migration_incoming_destroy()
migration: Introduce postcopy incoming setup and cleanup functions
migration: Move postcopy_ram_listen_thread() to postcopy-ram.c
migration: Do not try to start VM if disk activation fails
migration: Flush migration channel after sending data of CMD_PACKAGED
system/physmem: mark io_mem_unassigned lockless
scripts/vmstate-static-checker: Fix deprecation warnings with latest argparse
migration: vmsd errp handlers: return bool
migration/vmstate: stop reporting error number for new _errp APIs
tmp_emulator: improve and fix use of errp
migration: vmstate_save_state_v(): fix error path
migration: Properly wait on G_IO_IN when peeking messages
io: Add qio_channel_wait_cond() helper
migration: Put Error **errp parameter last
migration: Use bitset of MigMode instead of variable arguments
migration: Use unsigned instead of int for bit set of MigMode
migration: Don't free the reason after calling migrate_add_blocker
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge crypto and other misc fixes / features
* Increase minimum gnutls to 3.7.5
* Increase minimum libgcrypt to 1.9.4
* Increase minimum nettle to 3.7.3
* Drop obsolete in-tree XTS impl
* Fix memory leak when loading certificates
* Remove/reduce duplication when loading certifcates
* Fix possible crash when certificates are unloaded
while an active TLS connection is using when in a
TLS handshake operation
* Deprecate use of dh-params.pem file
* Document how to create certificates with Post-Quantum
Cryptography compliant algorithms.
* Support loading multiple certificate identities to
allow support for Post-Quantum crypto in parallel
with traditional RSA/ECC
* Add "-run-with exit-with-parent=on" parameter
* Flush pending errors when seeing ENOBUFS with
a zero-copy send attempt
* Fix data buffer parameters in hash & IO channel APIs
to use 'void *'
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE2vOm/bJrYpEtDo4/vobrtBUQT98FAmkIr/8ACgkQvobrtBUQ
# T9+2RhAAhEak/krdlTJw8OlJonUop7G5mlLU2TEoX0duRORcFhScsdSwb2pyc/wM
# tnwfWXsnsKFItJx1y3STkOICtdNqizGoU3+c7wl4anQBurydu+XTs4ESBtVJtMYr
# 1lTYvp0HFyKvaXwDWKE+ztltlJiog51tHPDLUIBCnyJysLVqxCHMHmkbG46IPBZo
# A2XXxp3j/VBPmhls0JHpbAD4iVE3PChdK7zhyeGe/rld9+0JA12EPCvZ5Uokdj41
# aYP/okvnVH1atucoygPdDE3P5GYBKaSXZUWqzfkKhU7FgaF2863Td7ff1ip+WyWN
# FFPNEU1hVg+T5hfsZVQmmIFDdSJWqoZaZM/WJVYdrRY4dKUCPnJ9OINbbnhuWz5E
# JFmZOPibRZKQ44XcHX49JRfJEBvoq1z9OT1r7HkEP4D9/O7V/riIunbAESMk0sgi
# 0/fatvdhNKMN6YBQM3mtN3yNOcfRSWFtSy9XS9zDjdpEKT7ui2t9FC0ZNSP0FRkS
# aTY31FyacjHwU3zaoh6NoqqpxV9wwHrgsJwNbA/IztjmX/jvGG0Gb/sXVEqM59tR
# e3VWTmlmZ1T8OLImh1hG4t+nY+XzI64QpVX8H9RCGm21o28DyTcOnTFK4OyIfWe5
# ttnNfEJN8WCVCsA8tcM8yAbZ/0qXrYfiZSO7hq79wE7LvyholAQ=
# =9ESG
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 03 Nov 2025 02:37:03 PM CET
# gpg: using RSA key DAF3A6FDB26B62912D0E8E3FBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [unknown]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* tag 'next-pr-pull-request' of https://gitlab.com/berrange/qemu: (32 commits)
docs: creation of x509 certs compliant with post-quantum crypto
crypto: support upto 5 parallel certificate identities
crypto: expand logic to cope with multiple certificate identities
crypto: avoid loading the identity certs twice
crypto: avoid loading the CA certs twice
crypto: deprecate use of external dh-params.pem file
crypto: make TLS credentials structs private
crypto: fix lifecycle handling of gnutls credentials objects
crypto: introduce a wrapper around gnutls credentials
crypto: introduce method for reloading TLS creds
crypto: reduce duplication in handling TLS priority strings
crypto: remove duplication loading x509 CA cert
crypto: shorten the endpoint == server check in TLS creds
crypto: move release of DH parameters into TLS creds parent
crypto: remove needless indirection via parent_obj field
crypto: use g_autofree when loading x509 credentials
crypto: move check for TLS creds 'dir' property
crypto: remove redundant access() checks before loading certs
crypto: replace stat() with access() for credential checks
crypto: add missing free of certs array
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When the Bus Master bit is disabled in a PCI device's Command Register,
the device's DMA address space becomes unassigned memory (i.e. the
io_mem_unassigned MemoryRegion).
This can lead to deadlocks with IOThreads since io_mem_unassigned
accesses attempt to acquire the Big QEMU Lock (BQL). For example,
virtio-pci devices deadlock in virtio_write_config() ->
virtio_pci_stop_ioeventfd() when waiting for the IOThread while holding
the BQL. The IOThread is unable to acquire the BQL but the vcpu thread
won't release the BQL while waiting for the IOThread.
io_mem_unassigned is trivially thread-safe since it has no state, it
simply rejects all load/store accesses. Therefore it is safe to enable
lockless I/O on io_mem_unassigned to eliminate this deadlock.
Here is the backtrace described above:
Thread 9 (Thread 0x7fccfcdff6c0 (LWP 247832) "CPU 4/KVM"):
#0 0x00007fcd11529d46 in ppoll () from target:/lib64/libc.so.6
#1 0x000056468a1a9bad in ppoll (__fds=<optimized out>, __nfds=<optimized out>, __timeout=0x0, __ss=0x0) at /usr/include/bits/poll2.h:88
#2 0x000056468a18f9d9 in fdmon_poll_wait (ctx=0x5646c6a1dc30, ready_list=0x7fccfcdfb310, timeout=-1) at ../util/fdmon-poll.c:79
#3 0x000056468a18f14f in aio_poll (ctx=<optimized out>, blocking=blocking@entry=true) at ../util/aio-posix.c:730
#4 0x000056468a1ad842 in aio_wait_bh_oneshot (ctx=<optimized out>, cb=cb@entry=0x564689faa420 <virtio_blk_ioeventfd_stop_vq_bh>, opaque=<optimized out>) at ../util/aio-wait.c:85
#5 0x0000564689faaa89 in virtio_blk_stop_ioeventfd (vdev=0x5646c8fd7e90) at ../hw/block/virtio-blk.c:1644
#6 0x0000564689d77880 in virtio_bus_stop_ioeventfd (bus=bus@entry=0x5646c8fd7e08) at ../hw/virtio/virtio-bus.c:264
#7 0x0000564689d780db in virtio_bus_stop_ioeventfd (bus=bus@entry=0x5646c8fd7e08) at ../hw/virtio/virtio-bus.c:256
#8 0x0000564689d7d98a in virtio_pci_stop_ioeventfd (proxy=0x5646c8fcf8e0) at ../hw/virtio/virtio-pci.c:413
#9 virtio_write_config (pci_dev=0x5646c8fcf8e0, address=4, val=<optimized out>, len=<optimized out>) at ../hw/virtio/virtio-pci.c:803
#10 0x0000564689dcb45a in memory_region_write_accessor (mr=mr@entry=0x5646c6dc2d30, addr=3145732, value=value@entry=0x7fccfcdfb528, size=size@entry=2, shift=<optimized out>, mask=mask@entry=65535, attrs=...) at ../system/memory.c:491
#11 0x0000564689dcaeb0 in access_with_adjusted_size (addr=addr@entry=3145732, value=value@entry=0x7fccfcdfb528, size=size@entry=2, access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x564689dcb3f0 <memory_region_write_accessor>, mr=0x5646c6dc2d30, attrs=...) at ../system/memory.c:567
#12 0x0000564689dcb156 in memory_region_dispatch_write (mr=mr@entry=0x5646c6dc2d30, addr=addr@entry=3145732, data=<optimized out>, op=<optimized out>, attrs=attrs@entry=...) at ../system/memory.c:1554
#13 0x0000564689dd389a in flatview_write_continue_step (attrs=..., attrs@entry=..., buf=buf@entry=0x7fcd05b87028 "", mr_addr=3145732, l=l@entry=0x7fccfcdfb5f0, mr=0x5646c6dc2d30, len=2) at ../system/physmem.c:3266
#14 0x0000564689dd3adb in flatview_write_continue (fv=0x7fcadc0d8930, addr=3761242116, attrs=..., ptr=0xe0300004, len=2, mr_addr=<optimized out>, l=<optimized out>, mr=<optimized out>) at ../system/physmem.c:3296
#15 flatview_write (fv=0x7fcadc0d8930, addr=addr@entry=3761242116, attrs=attrs@entry=..., buf=buf@entry=0x7fcd05b87028, len=len@entry=2) at ../system/physmem.c:3327
#16 0x0000564689dd7191 in address_space_write (as=0x56468b433600 <address_space_memory>, addr=3761242116, attrs=..., buf=0x7fcd05b87028, len=2) at ../system/physmem.c:3447
#17 address_space_rw (as=0x56468b433600 <address_space_memory>, addr=3761242116, attrs=attrs@entry=..., buf=buf@entry=0x7fcd05b87028, len=2, is_write=<optimized out>) at ../system/physmem.c:3457
#18 0x0000564689ff1ef6 in kvm_cpu_exec (cpu=cpu@entry=0x5646c6dab810) at ../accel/kvm/kvm-all.c:3248
#19 0x0000564689ff32f5 in kvm_vcpu_thread_fn (arg=arg@entry=0x5646c6dab810) at ../accel/kvm/kvm-accel-ops.c:53
#20 0x000056468a19225c in qemu_thread_start (args=0x5646c6db6190) at ../util/qemu-thread-posix.c:393
#21 0x00007fcd114c5b68 in start_thread () from target:/lib64/libc.so.6
#22 0x00007fcd115364e4 in clone () from target:/lib64/libc.so.6
Thread 3 (Thread 0x7fcd0503a6c0 (LWP 247825) "IO iothread1"):
#0 0x00007fcd114c2d30 in __lll_lock_wait () from target:/lib64/libc.so.6
#1 0x00007fcd114c8fe2 in pthread_mutex_lock@@GLIBC_2.2.5 () from target:/lib64/libc.so.6
#2 0x000056468a192538 in qemu_mutex_lock_impl (mutex=0x56468b432e60 <bql>, file=0x56468a1e26a5 "../system/physmem.c", line=3198) at ../util/qemu-thread-posix.c:94
#3 0x0000564689dc12e2 in bql_lock_impl (file=file@entry=0x56468a1e26a5 "../system/physmem.c", line=line@entry=3198) at ../system/cpus.c:566
#4 0x0000564689ddc151 in prepare_mmio_access (mr=0x56468b433800 <io_mem_unassigned>) at ../system/physmem.c:3198
#5 address_space_lduw_internal_cached_slow (cache=<optimized out>, addr=2, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at ../system/memory_ldst.c.inc:211
#6 address_space_lduw_le_cached_slow (cache=<optimized out>, addr=addr@entry=2, attrs=attrs@entry=..., result=result@entry=0x0) at ../system/memory_ldst.c.inc:253
#7 0x0000564689fd692c in address_space_lduw_le_cached (result=0x0, cache=<optimized out>, addr=2, attrs=...) at /var/tmp/qemu/include/exec/memory_ldst_cached.h.inc:35
#8 lduw_le_phys_cached (cache=<optimized out>, addr=2) at /var/tmp/qemu/include/exec/memory_ldst_phys.h.inc:66
#9 virtio_lduw_phys_cached (vdev=<optimized out>, cache=<optimized out>, pa=2) at /var/tmp/qemu/include/hw/virtio/virtio-access.h:166
#10 vring_avail_idx (vq=0x5646c8fe2470) at ../hw/virtio/virtio.c:396
#11 virtio_queue_split_set_notification (vq=0x5646c8fe2470, enable=0) at ../hw/virtio/virtio.c:534
#12 virtio_queue_set_notification (vq=0x5646c8fe2470, enable=0) at ../hw/virtio/virtio.c:595
#13 0x000056468a18e7a8 in poll_set_started (ctx=ctx@entry=0x5646c6c74e30, ready_list=ready_list@entry=0x7fcd050366a0, started=started@entry=true) at ../util/aio-posix.c:247
#14 0x000056468a18f2bb in poll_set_started (ctx=0x5646c6c74e30, ready_list=0x7fcd050366a0, started=true) at ../util/aio-posix.c:226
#15 try_poll_mode (ctx=0x5646c6c74e30, ready_list=0x7fcd050366a0, timeout=<synthetic pointer>) at ../util/aio-posix.c:612
#16 aio_poll (ctx=0x5646c6c74e30, blocking=blocking@entry=true) at ../util/aio-posix.c:689
#17 0x000056468a032c26 in iothread_run (opaque=opaque@entry=0x5646c69f3380) at ../iothread.c:63
#18 0x000056468a19225c in qemu_thread_start (args=0x5646c6c75410) at ../util/qemu-thread-posix.c:393
#19 0x00007fcd114c5b68 in start_thread () from target:/lib64/libc.so.6
#20 0x00007fcd115364e4 in clone () from target:/lib64/libc.so.6
Buglink: https://issues.redhat.com/browse/RHEL-71933
Reported-by: Peixiu Hou <phou@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20251029185224.420261-1-stefanha@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Currently, CoCo VMs can perform conversion at the base page granularity,
which is the granularity that has to be tracked. In relevant setups, the
target page size is assumed to be equal to the host page size, thus
fixing the block size to the host page size.
However, since private memory and shared memory have different backend
at present, users can specify shared memory with a hugetlbfs backend
while private memory with guest_memfd backend only supports 4K page
size. In this scenario, ram_block->page_size is different from the host
page size which will trigger an assertion when retrieving the block
size.
To address this, return the host page size directly to relax the
restriction. This changes fixes a regression of using hugetlbfs backend
for shared memory within CoCo VMs, with or without VFIO devices' presence.
Acked-by: David Hildenbrand <david@redhat.com>
Tested-by: Farrah Chen <farrah.chen@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Link: https://lore.kernel.org/r/20251023095526.48365-2-chenyi.qiang@intel.com
[peterx: fix subject, per david]
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
QEMU has a per-thread "bql_locked" variable stored in TLS section, showing
whether the current thread is holding the BQL lock.
It's a pretty handy variable. Function-wise, QEMU have codes trying to
conditionally take bql, relying on the var reflecting the locking status
(e.g. BQL_LOCK_GUARD), or in a GDB debugging session, we could also look at
the variable (in reality, co_tls_bql_locked), to see which thread is
currently holding the bql.
When using that as a debugging facility, sometimes we can observe multiple
threads holding bql at the same time. It's because QEMU's condvar APIs
bypassed the bql_*() API, hence they do not update bql_locked even if they
have released the mutex while waiting.
It can cause confusion if one does "thread apply all p co_tls_bql_locked"
and see multiple threads reporting true.
Fix this by moving the bql status updates into the mutex debug hooks. Now
the variable should always reflect the reality.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250904223158.1276992-1-peterx@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
cpus_kick_thread() is called via cpu_exit() -> qemu_cpu_kick(),
and also via gdb_syscall_handling(). Access the CPUState field
using atomic accesses. See commit 8ac2ca02744 ("accel: use atomic
accesses for exit_request") for rationale.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Message-Id: <20250925025520.71805-3-philmd@linaro.org>