These functions wrap ioctl(). When ioctl() fails, it sets @errno.
The wrappers then return that @errno negated.
Except they call accel_ioctl_end() between calling ioctl() and reading
@errno. accel_ioctl_end() can clobber @errno, e.g. when a futex()
system call fails. Seems unlikely, but it's a bug all the same.
Fix by retrieving @errno before calling accel_ioctl_end().
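For illustration, the fixed pattern looks like this (a simplified
sketch of a wrapper such as kvm_vm_ioctl(); exact names vary):

    ret = ioctl(fd, type, arg);
    err = errno;            /* save before anything can clobber it */
    accel_ioctl_end();      /* may fail a futex() and change errno */
    if (ret == -1) {
        ret = -err;         /* return the saved errno, negated */
    }
    return ret;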
Fixes: a27dd2de68 (KVM: keep track of running ioctls)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20251128152050.3417834-1-armbru@redhat.com>
(cherry picked from commit 88be119fb1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
trans_BRA does
    gen_a64_set_pc(s, dst);
    set_btype_for_br(s, a->rn);
gen_a64_set_pc does
    s->pc_save = -1;
set_btype_for_br (if aa64_bti is enabled and the register is not x16 or
x17) does
    gen_pc_plus_diff(s, pc, 0);
gen_pc_plus_diff does
    assert(s->pc_save != -1);
Hence, this assert is getting hit. We need to call set_btype_for_br
before gen_a64_set_pc, and there is nothing in set_btype_for_br that
depends on gen_a64_set_pc having already been called, so this commit
simply swaps the calls.
(The commit message for 64678fc45d says that set_btype_for_br()
must be "moved after" gen_a64_set_pc(), but this is a mistake in
the commit message -- the actual changes in that commit move
set_btype_for_br() *before* gen_a64_set_pc() and this is necessary
to avoid the assert.)
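After the fix, the tail of trans_BRA looks like this (simplified
sketch):

    set_btype_for_br(s, a->rn);   /* needs s->pc_save still valid */
    gen_a64_set_pc(s, dst);       /* sets s->pc_save = -1 */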
Cc: qemu-stable@nongnu.org
Fixes: 64678fc45d ("target/arm: Fix BTI versus CF_PCREL")
Signed-off-by: Harald van Dijk <hdijk@accesssoftek.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: d2265ebb-84bc-41b7-a2d7-05dc9a5a2055@accesssoftek.com
[PMM: added note about 64678fc45d to commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 7248dab3c9)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the XDMA, RTC and SDHCI device models of the Aspeed SoCs were
first introduced, their MMIO regions were given DEVICE_NATIVE_ENDIAN
endianness. It should be DEVICE_LITTLE_ENDIAN. Fix that.
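The change amounts to flipping the .endianness field of the affected
MemoryRegionOps; a sketch based on the SDHCI model (the exact contents
of the struct in hw/sd/aspeed_sdhci.c may differ):

    static const MemoryRegionOps aspeed_sdhci_ops = {
        .read = aspeed_sdhci_read,
        .write = aspeed_sdhci_write,
        .endianness = DEVICE_LITTLE_ENDIAN, /* was NATIVE */
    };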
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20251125142631.676689-1-clg@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 57756aa01f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
msix_init() and msix_init_exclusive_bar() take an "unsigned short"
argument for the number of MSI-X vectors to try to use. This is big
enough for the maximum permitted number of vectors, which is 2048.
Unfortunately, we have several devices (most notably virtio) which
allow the user to specify the desired number of vectors, and which
use uint32_t properties for this. If the user sets the property to a
value that is too big for a uint16_t, the value will be truncated
when it is passed to msix_init(), and msix_init() may then return
success if the truncated value is a valid one.
The resulting mismatch between the number of vectors the msix code
thinks the device has and the number of vectors the device itself
thinks it has can cause assertions, such as the one in issue 2631,
where "-device virtio-mouse-pci,vectors=19923041" is interpreted by
msix as "97 vectors" and by the virtio-pci layer as "19923041
vectors"; a guest attempt to access vector 97 thus passes the
virtio-pci bounds checking and hits an assertion in
msix_vector_use().
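The truncation itself is plain C arithmetic, since 19923041 mod 65536
is 97:

    uint32_t requested = 19923041;       /* uint32_t 'vectors' property */
    unsigned short nentries = requested; /* old msix_init() parameter */
    /* nentries is now 97: valid for msix, wrong for virtio-pci */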
Avoid this by making msix_init() and its wrapper function
msix_init_exclusive_bar() take the number of vectors as a uint32_t.
The erroneous command line will now produce the warning
    qemu-system-i386: -device virtio-mouse-pci,vectors=19923041:
    warning: unable to init msix vectors to 19923041
and proceed without crashing. (The virtio device warns and falls
back to not using MSIX, rather than complaining that the option is
not a valid value; this is the same as the existing behaviour for
values that are beyond the MSI-X maximum possible value but fit into
a 16-bit integer, like 2049.)
To ensure this doesn't result in potential overflows in calculation
of the BAR size in msix_init_exclusive_bar(), we duplicate the
nentries error-check from msix_init() at the top of
msix_init_exclusive_bar(), so we know nentries is sane before we
start using it.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2631
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20251107131044.1321637-1-peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit ef44cc0a76)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Note that this issue seems already fixed as a consequence of the large
io_uring rework with 047dabef97 ("block/io_uring: use aio_add_sqe()")
in current master, so this is purely for QEMU stable branches.
At the end of ioq_submit(), there is an opportunistic call to
luring_process_completions(). This is the single caller of
luring_process_completions() that doesn't use the
luring_process_completions_and_submit() wrapper.
Other callers use the wrapper, because luring_process_completions()
might require a subsequent call to ioq_submit() after resubmitting a
request. As noted for luring_resubmit():
> Resubmit a request by appending it to submit_queue. The caller must ensure
> that ioq_submit() is called later so that submit_queue requests are started.
So the caller at the end of ioq_submit() violates the contract and can
in fact be problematic if no other requests come in later. In such a
case, the request intended to be resubmitted will never actually be
submitted via io_uring_submit().
A reproducer exposing this issue is [0], which is based on user
reports from [1]. Another reproducer is iotest 109 with '-i io_uring'.
I had the most success triggering the issue with [0] when using a
BTRFS RAID 1 storage. With tmpfs, it can take quite a few iterations,
but also triggers eventually on my machine. With iotest 109 with '-i
io_uring' the issue triggers reliably on my ext4 file system.
Have ioq_submit() submit any resubmitted requests after calling
luring_process_completions(). The return value from io_uring_submit()
is checked to be non-negative before the opportunistic processing of
completions and going for the new resubmit logic, to ensure that a
failure of io_uring_submit() is not missed. Also note that the return
value already was not necessarily the total number of submissions,
since the loop might've been iterated more than once even before the
current change.
Only trigger the resubmission logic if it is actually necessary, to
avoid changing behavior more than needed. For example, iotest 109
would produce more 'mirror ready' events if always resubmitting after
luring_process_completions() at the end of ioq_submit().
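The resulting control flow looks roughly like this (a sketch; the
actual queue and field names in block/io_uring.c are simplified):

    do {
        /* prepare and submit SQEs from the submit queue */
        ret = io_uring_submit(&s->ring);
        if (ret < 0) {
            break;    /* don't mask a submission failure */
        }
        /* may call luring_resubmit(), re-filling the submit queue */
        luring_process_completions(s);
    } while (!QSIMPLEQ_EMPTY(&s->io_q.submit_queue));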
Note iotest 109 still does not pass as is when run with '-i io_uring',
because of two offset values for BLOCK_JOB_COMPLETED events being zero
instead of non-zero as in the expected output. Note that the two
affected test cases are expected failures and still fail, so they just
fail "faster". The test cases are actually not triggering the resubmit
logic, so the reason seems to be different ordering of requests and
completions of the current aio=io_uring implementation versus
aio=threads.
[0]:
> #!/bin/bash -e
> #file=/mnt/btrfs/disk.raw
> file=/tmp/disk.raw
> filesize=256
> readsize=512
> rm -f $file
> truncate -s $filesize $file
> ./qemu-system-x86_64 --trace '*uring*' --qmp stdio \
> --blockdev raw,node-name=node0,file.driver=file,file.cache.direct=off,file.filename=$file,file.aio=io_uring \
> <<EOF
> {"execute": "qmp_capabilities"}
> {"execute": "human-monitor-command", "arguments": { "command-line": "qemu-io node0 \"read 0 $readsize \"" }}
> {"execute": "quit"}
> EOF
[1]: https://forum.proxmox.com/threads/170045/
Cc: qemu-stable@nongnu.org
Signed-off-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When new requests arrive at a BlockBackend that is currently drained,
these requests are queued until the drain section ends.
There is a race window between blk_root_drained_end() waking up a queued
request in an iothread from the main thread and blk_wait_while_drained()
actually being woken up in the iothread and calling blk_inc_in_flight().
If the BlockBackend is drained again during this window, drain won't
wait for this request and it will sneak in when the BlockBackend is
already supposed to be quiesced. This causes assertion failures in
bdrv_drain_all_begin() and can have other unintended consequences.
Fix this by increasing the in_flight counter immediately when scheduling
the request to be resumed so that the next drain will wait for it to
complete.
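In outline, the change looks like this (a conceptual sketch; the real
code in block/block-backend.c is more involved):

    /* main thread, blk_root_drained_end(): */
    blk_inc_in_flight(blk);    /* new: count the request right away */
    aio_co_wake(queued_co);    /* request resumes in its iothread */

    /* iothread, blk_wait_while_drained(), once woken: */
    /* no longer calls blk_inc_in_flight() itself; the reference
     * taken above already covers the request */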
Cc: qemu-stable@nongnu.org
Reported-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20251119172720.135424-1-kwolf@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Tested-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com>
Reviewed-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8eeaa706ba)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When there is no display device on the QEMU machine and the user
accesses QEMU only via remote VNC, querying `info vnc` via QMP makes
QEMU abort.
To avoid this abort, add a display device check when querying the VNC
info in qmp_query_vnc_servers().
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Alano Song <AlanoSong@163.com>
[ Marc-André - removed useless Error *err ]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20251125131955.7024-1-AlanoSong@163.com>
(cherry picked from commit 4c1646e23f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In fimd_update_memory_section() we attempt to find and map part of
the RAM MR which backs the framebuffer, based on guest-configurable
size and start address.
If the guest configures framebuffer settings which result in a
zero-sized framebuffer, we hit an assertion, because
memory_region_find() will return a NULL mem_section.mr.
Explicitly check for the zero-size case and treat this as a
guest error.
Because we now have a code path which can reach error_return without
calling memory_region_find to set w->mem_section, we must NULL out
w->mem_section.mr after the unref of the old MR, so that error_return
does not incorrectly double-unref the old MR.
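A sketch of the added guard (names simplified from
hw/display/exynos4210_fimd.c):

    if (fb_len == 0) {
        /* zero-sized framebuffer: guest error, do not call
         * memory_region_find() */
        qemu_log_mask(LOG_GUEST_ERROR,
                      "%s: framebuffer has zero size\n", __func__);
        goto error_return;
    }

and, on the error path after unrefing the old MR:

    w->mem_section.mr = NULL;  /* avoid a later double-unref */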
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1407
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20251107143913.1341358-1-peter.maydell@linaro.org
(cherry picked from commit 579be921f5)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For M-profile cores which support TrustZone, there are some memory
areas which are "NS aliases" -- a Secure access to these addresses
really performs an NS access to a different part of the device. We
implement these using MemoryRegionOps read and write functions which
pass the access on with adjusted attributes using
memory_region_dispatch_read() and memory_region_dispatch_write().
Since the MR we are dispatching to is owned by the same device that
owns the NS-alias MR (the TYPE_ARMV7M container object), this trips
the reentrancy-guard that is applied by access_with_adjusted_size().
Mark the NS alias MemoryRegions as disable_reentrancy_guard; this is
safe because v7m_sysreg_ns_read() and v7m_sysreg_ns_write() do not
touch any of the device's state. (Any further reentrancy attempts by
the underlying MR will still be caught.)
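The fix is a one-liner per alias region; a sketch, assuming the init
sequence in hw/arm/armv7m.c looks roughly like this:

    memory_region_init_io(&s->sysreg_ns_mem, OBJECT(s),
                          &v7m_sysreg_ns_ops, s,
                          "nvic_sysregs_ns", 0x1000);
    /* safe: the ops only re-dispatch with adjusted attributes and
     * touch no device state */
    s->sysreg_ns_mem.disable_reentrancy_guard = true;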
Without this fix, an attempt to read from an address like 0xe002e010,
which is a register in the NS systick alias, will fail and provoke
    qemu-system-arm: warning: Blocked re-entrant IO on MemoryRegion: v7m_systick at addr: 0x0
We didn't notice this earlier because almost all code accesses
the registers and systick via the non-alias addresses; the NS
aliases are only needed for the rarer case of Secure code that needs
to manage the NS timer or system state on behalf of NS code.
Note that although the v7m_systick_ops read and write functions
also call memory_region_dispatch_{read,write}, this MR does not
need to have the reentrancy-guard disabled because the underlying
MR that it forwards to is owned by a different device (the
TYPE_SYSTICK timer device).
Reported via a stackoverflow question:
https://stackoverflow.com/questions/79808107/what-this-error-is-even-about-qemu-system-arm-warning-blocked-re-entrant-io
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20251114155304.2662414-1-peter.maydell@linaro.org
(cherry picked from commit 4a934d284d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is a copy & paste error: USO6 should be there.
Fixes: 58f8168978 ("qmp: update virtio feature maps, vhost-user-gpio introspection")
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 5fbcbf76a1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Some functional tests are currently not covered by the entries
in MAINTAINERS yet, so scripts/get_maintainers.pl fails to suggest
the right people who should be CC:-ed for related patches.
Add the uncovered tests to the right sections to close this gap.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20250414121520.213665-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 12c6b61530)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Now that all Avocado tests have been converted to or been replaced by
other functional tests, we can delete the remainders of the Avocado
tests from the QEMU source tree.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-16-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 52e9ed6d3a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This file was meant for defining the vocabulary for our testing
efforts, but it did not age well: First, the definitions are not
only about the CI part, but also about testing in general, so most
of the information should rather reside in main.rst instead.
Second, some vocabulary has never been really adopted by the QEMU
project, for example we never really use the word "system testing"
since "system" rather means the system emulator binaries in the
QEMU project (and we also don't do any testing with other components
like libvirt and virt-managers here). It also defines that the qtests
are the "functional" tests in QEMU, which is not really up to date
anymore after the "tests/functional" framework has been introduced
a couple of months ago (FWIW, the qtests could rather be seen as a
mix between unit testing and functional testing).
To solve this problem, move the useful parts of this file into
main.rst and directly into ci.rst, and drop the ones (like "system
testing") that we don't really need anymore.
Message-ID: <20250314085959.1585568-1-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 5748e46415)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since we don't run the Avocado jobs in the CI anymore, rename
these variables to QEMU_JOB_FUNCTIONAL and QEMU_CI_FUNCTIONAL.
Also, there was a mismatch between the documentation and the
implementation of QEMU_CI_AVOCADO_TESTING: While the documentation
said that you had to "Set this variable to have the tests using the
Avocado framework run automatically", you indeed needed to set it
to make the pipelines appear in your dashboard - but they were never
run automatically in forks and had to be triggered manually. Let's
improve this now: No need to hide these pipelines from the users
by default anymore (the functional tests should be stable enough
nowadays), and rather allow the users to run the pipelines
automatically with this switch now instead, as was documented.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-15-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f8c5484417)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This test was using cloudinit and a "dnf install" command in the guest
to exercise the NIC with SMMU enabled. Since we don't have the cloudinit
stuff in the functional framework and we should not rely on having access
to external networks (once our ASSETs have been cached), we now boot
into the initrd first, manually mount the root disk and then use the
check_http_download() function from the functional framework here instead
for testing whether the network works as expected.
Unfortunately, there seems to be a small race when using the files
from Fedora 33: To enter the initrd shell, we have to send a "return"
once. But it does not seem to work if we send it too early. Using a
sleep(0.2) makes it work reliably for me, but to make it even more
unlikely to trigger this situation, let's limit the Fedora 33 tests
to run only with KVM.
Finally, while we're at it, we also add some lines for testing writes
to the hard disk, as we already do it in the test_intel_iommu test.
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Message-ID: <20250414113031.151105-14-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 5c2bae2155)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This way we can do a full boot in record-replay mode and
should get a similar test coverage compared to the old
replay test from tests/avocado/replay_linux.py.
Since the aarch64 test was the last avocado test in the
tests/avocado/replay_linux.py file, we can remove this
file now completely.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250414113031.151105-13-thuth@redhat.com>
(cherry picked from commit a820caf844)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This way we can do a full boot in record-replay mode and
should get a similar test coverage compared to the old
replay test from tests/avocado/replay_linux.py. Thus remove
the x86 avocado replay_linux test now.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-12-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 7fecdb0acd)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These tests are based on the cloudinit functions from Avocado.
The cloudinit is very, very slow compared to our other tests,
so most of these Avocado tests have either been disabled by default
with a decorator, or have been marked to only run with KVM.
We won't include this sluggish cloudinit stuff in the functional
framework, and we've already got plenty of other tests there that
check pretty much the same things, so let's simply get rid of these
old tests now.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-11-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit e83aee9c6a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reuse the test function from the 32-bit big endian test to easily
convert the 64-bit big endian Wheezy mips test.
Since this was the last test in tests/avocado/linux_ssh_mips_malta.py,
we can remove this avocado file now, too.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-10-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f79592f427)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The test checks some entries in /proc and the output of some commands ...
we put these checks into exportable functions now so that they can
be reused more easily.
Additionally, linux_ssh_mips_malta.py uses SSH to test the networking
of the guest. Since we don't have an SSH module in the functional
framework yet, let's use the check_http_download() function here instead.
And while we're at it, also switch the NIC to e1000 now to get some more
test coverage, since the "pcnet" device is already tested in the
test_mips_malta_cpio test.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-7-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 42a87f0ce7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These tests are using the gdb-related library functions from the
Avocado framework which we don't have in the functional framework
yet. So for the time being, keep those imports and skip the test
if the Avocado framework is not installed on the host.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-4-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 951ededf12)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
test_x86_64_pc in tests/avocado/boot_linux_console.py only checks
whether the kernel parameters have correctly been passed to the
kernel in the guest by looking for them in the console output of the
guest. Let's move that to the functional test framework now, but
instead of doing it in a separate test, let's do it for all tuxrun
tests instead, so it is done automatically for all targets that have
a tuxrun test.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-3-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit bc65ae6961)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We are going to move the remaining Avocado tests step by step
into the functional test framework. Unfortunately, Avocado fails
with an error if it cannot determine a test to run, so disable
the tests here now to avoid failures in the Gitlab-CI during the
next steps.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20250414113031.151105-2-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 22baa5f340)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The test_change_password test will fail if no cryptographic backend is
available (e.g. if QEMU was built on a system with no cryptographic
library development packages installed); just skip the test in that
case.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20250414093732.220498-1-cohuck@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 4e3823c68c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The stack can be 32-bit even in real mode, and in this case
the stack pointer must be updated in its entirety rather than
just the bottom 16 bits. The same is true of real mode IRET,
for which there was even a comment suggesting the right thing
to do.
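The underlying idea, sketched (target/i386/tcg/seg_helper.c uses
sp_mask/SET_ESP helpers for this; names here are simplified):

    uint32_t sp_mask = (env->segs[R_SS].flags & DESC_B_MASK)
                       ? 0xffffffff : 0xffff;
    /* with a 32-bit stack (B bit set), update all of ESP, not just SP */
    env->regs[R_ESP] = (env->regs[R_ESP] & ~sp_mask)
                     | (new_sp & sp_mask);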
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1506
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 106d766c9d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The exit_code parameter of cpu_vmexit is declared as uint32_t, but exit
codes are 64 bits wide according to the AMD SVM specification. And because
uint32_t is unsigned, this causes exit codes to be zero-extended, for example
writing SVM_EXIT_ERR as 0xffff_ffff instead of the expected 0xffff_ffff_ffff_ffff.
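The fix is a type change in the prototype (the other parameters are
shown as assumptions):

    /* before: exit codes truncated and zero-extended via uint32_t */
    void cpu_vmexit(CPUX86State *env, uint32_t exit_code,
                    uint64_t exit_info_1, uintptr_t retaddr);
    /* after: SVM_EXIT_ERR (-1) round-trips as 0xffff_ffff_ffff_ffff */
    void cpu_vmexit(CPUX86State *env, uint64_t exit_code,
                    uint64_t exit_info_1, uintptr_t retaddr);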
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2977
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9c3afb9d9b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
There is a small set of binary SSE insns which have no MMX
equivalent, which we create the gen functions for with the
BINARY_INT_SSE() macro. This forwards to gen_binary_int_sse() with a
NULL pointer for 'mmx'.
For almost all of these insns we correctly mark them in the decode
table as not permitting a zero prefix byte; however we got this wrong
for VPERMILPS, with the result that a bogus instruction would get
through the decode checks and end up in gen_binary_int_sse() trying
to call a NULL pointer.
Correct the decode table entry for VPERMILPS so that we get the
expected #UD exception.
In the x86 SDM, table A-4 "Three-byte Opcode Map: 08H-FFH
(First Two Bytes are 0F 38H)" confirms that there is no pfx 0
version of VPERMILPS.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/3199
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Link: https://lore.kernel.org/r/20251114175417.2794804-1-peter.maydell@linaro.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ebd9ea2947)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Registers are always 32-bit aligned. R_MAX is not the maximum
register address, it is the maximum register number. The memory
size is therefore 4 * R_MAX.
Currently every register with an offset bigger than 0x40 will be
ignored, because the memory size is set wrong. This affects the
MCTRL register and makes it useless. This commit restores the
correct behaviour.
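Sketched against the usual QEMU register-array idiom (the exact names
in hw/dma/xlnx-zynq-devcfg.c may differ):

    /* R_MAX counts 32-bit registers, so the MMIO window spans
     * R_MAX * 4 bytes, not R_MAX bytes */
    memory_region_init_io(&s->iomem, obj, &devcfg_reg_ops, s,
                          "devcfg", XLNX_ZYNQ_DEVCFG_R_MAX * 4);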
Cc: qemu-stable@nongnu.org
Fixes: 034c2e6902 ("dma: Add Xilinx Zynq devcfg device model")
Signed-off-by: YannickV <Y.Vossen@beckhoff.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20251111102836.212535-9-corvin.koehne@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit a344e22917)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
bdrv_co_get_self_request() does not take a lock around iterating through
bs->tracked_requests. With multiqueue, it may thus iterate over a list
that is in the process of being modified, producing an assertion
failure:
../block/file-posix.c:3702: raw_do_pwrite_zeroes: Assertion `req' failed.
[0] abort() at /lib64/libc.so.6
[1] __assert_fail_base.cold() at /lib64/libc.so.6
[2] raw_do_pwrite_zeroes() at ../block/file-posix.c:3702
[3] bdrv_co_do_pwrite_zeroes() at ../block/io.c:1910
[4] bdrv_aligned_pwritev() at ../block/io.c:2109
[5] bdrv_co_do_zero_pwritev() at ../block/io.c:2192
[6] bdrv_co_pwritev_part() at ../block/io.c:2292
[7] bdrv_co_pwritev() at ../block/io.c:2225
[8] handle_alloc_space() at ../block/qcow2.c:2573
[9] qcow2_co_pwritev_task() at ../block/qcow2.c:2625
Fix this by taking reqs_lock.
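A sketch of the locked iteration (simplified from block/io.c):

    BdrvTrackedRequest *req, *found = NULL;
    Coroutine *self = qemu_coroutine_self();

    qemu_mutex_lock(&bs->reqs_lock);
    QLIST_FOREACH(req, &bs->tracked_requests, list) {
        if (req->co == self) {
            found = req;
            break;
        }
    }
    qemu_mutex_unlock(&bs->reqs_lock);
    return found;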
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20251110154854.151484-11-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9b9ee60c07)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
nvme wakes the request coroutine via qemu_coroutine_enter() from a BH
scheduled in the BDS AioContext. This may not be the same context as
the one in which the request originally ran, which would be wrong:
- It could mean we enter the coroutine before it yields,
- We would move the coroutine into a different context.
(Can be reproduced with multiqueue by adding a usleep(100000) before the
`while (data.ret == -EINPROGRESS)` loop.)
To fix that, use aio_co_wake() to run the coroutine in its home context.
Just like in the preceding iscsi and nfs patches, we can drop the
trivial nvme_rw_cb_bh() and use aio_co_wake() directly.
With this, we can remove NVMeCoData.ctx.
Note the check of data->co == NULL to bypass the BH/yield combination in
case nvme_rw_cb() is called from nvme_submit_command(): We probably want
to keep this fast path for performance reasons, but we have to be quite
careful about it:
- We cannot overload .ret for this, but have to use a dedicated
.skip_yield field. Otherwise, if nvme_rw_cb() runs in a different
thread than the coroutine, the coroutine may see .ret set and skip the
yield, while nvme_rw_cb() still schedules a BH for waking. Therefore,
the signal to skip the yield can only be set in nvme_rw_cb() if waking
too is skipped, which is independent from communicating the return
value.
- We can only skip the yield if nvme_rw_cb() actually runs in the
request coroutine. Otherwise (specifically if they run in different
AioContexts), the order between this function’s execution and the
coroutine yielding (or not yielding) is not reliable.
- There is no point in yielding in a loop; there are no spurious wakes,
so once we yield, we will only be re-entered once the command is done.
Replace `while` by `if`.
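A sketch of the completion callback after the change (field names
follow the commit text; the real logic in block/nvme.c is more
involved):

    static void nvme_rw_cb(void *opaque, int ret)
    {
        NVMeCoData *data = opaque;

        data->ret = ret;
        if (!data->co) {
            /* called synchronously from nvme_submit_command() inside
             * the request coroutine: skip the wake, and only in this
             * case may the coroutine skip its yield */
            data->skip_yield = true;
        } else {
            /* wakes the coroutine in its home context, from any
             * thread, even before it has yielded */
            aio_co_wake(data->co);
        }
    }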
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20251110154854.151484-9-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 0f142cbd91)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
nvme_process_completion() must run in the main BDS context, so schedule
a BH for requests that aren’t there.
The context in which we kick does not matter, but let’s just keep kick
and process_completion together for simplicity’s sake.
(For what it’s worth, a quick fio bandwidth test indicates that on my
test hardware, if anything, this may be a bit better than kicking
immediately before scheduling a pure nvme_process_completion() BH. But
I wouldn’t take more from those results than that it doesn’t really seem
to matter either way.)
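In outline (a sketch; the BH helper name is an assumption):

    if (qemu_get_current_aio_context() != q->s->aio_context) {
        /* completions must be processed in the main BDS context;
         * keep kick and process_completion together */
        aio_bh_schedule_oneshot(q->s->aio_context,
                                nvme_kick_and_process_completion_bh, q);
    } else {
        nvme_kick(q);
        nvme_process_completion(q);
    }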
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20251110154854.151484-8-hreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7a501bbd51)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If we wake a coroutine from a different context, we must ensure that it
will yield exactly once (now or later), awaiting that wake.
curl’s current .ret == -EINPROGRESS loop may lead to the coroutine not
yielding if the request finishes before the loop gets run. To fix it,
we must drop the loop and yield exactly once, if we need to yield.
Finding out the latter part ("if we need to yield") makes it a bit
complicated: Requests may be served from a cache internal to the curl
block driver, or fail before being submitted. In these cases, we must
not yield. However, if we find a matching but still ongoing request in
the cache, we will have to await that, i.e. still yield.
To address this, move the yield inside of the respective functions:
- Inside of curl_find_buf() when awaiting ongoing concurrent requests,
- Inside of curl_setup_preadv() when having created a new request.
Rename curl_setup_preadv() to curl_do_preadv() to reflect this.
(Can be reproduced with multiqueue by adding a usleep(100000) before the
`while (acb.ret == -EINPROGRESS)` loop.)
Also, add a comment why aio_co_wake() is safe regardless of whether the
coroutine and curl_multi_check_completion() run in the same context.
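The core of the change, as a pattern (sketch):

    /* removed: may skip the yield entirely if the request finishes
     * first, while the completion path still wakes us */
    while (acb.ret == -EINPROGRESS) {
        qemu_coroutine_yield();
    }

    /* instead: at the point where a wake is known to be coming,
     * yield exactly once */
    qemu_coroutine_yield();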
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20251110154854.151484-6-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 53d5c7ffac)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Like in “rbd: Run co BH CB in the coroutine’s AioContext”, drop the
completion flag, yield exactly once, and run the BH in the coroutine’s
AioContext.
(Can be reproduced with multiqueue by adding a usleep(100000) before the
`while (!task.complete)` loops.)
Like in “iscsi: Run co BH CB in the coroutine’s AioContext”, this makes
nfs_co_generic_bh_cb() trivial, so we can drop it in favor of just
calling aio_co_wake() directly.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20251110154854.151484-5-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit deb35c129b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
qemu_rbd_completion_cb() schedules the request completion code
(qemu_rbd_finish_bh()) to run in the BDS’s AioContext, assuming that
this is the same thread in which qemu_rbd_start_co() runs.
To explain, this is how the latter two functions interact:
In qemu_rbd_start_co():
    while (!task.complete)
        qemu_coroutine_yield();
In qemu_rbd_finish_bh():
    task->complete = true;
    aio_co_wake(task->co); // task->co is qemu_rbd_start_co()
For this interaction to work reliably, both must run in the same thread
so that qemu_rbd_finish_bh() can only run once the coroutine yields.
Otherwise, finish_bh() may run before start_co() checks task.complete,
which will result in the latter seeing .complete as true immediately and
skipping the yield altogether, even though finish_bh() still wakes it.
With multiqueue, the BDS’s AioContext is not necessarily the thread
start_co() runs in, and so finish_bh() may be scheduled to run in a
different thread than start_co(). With the right timing, this will
cause the problems described above; waking a non-yielding coroutine is
not good, as can be reproduced by putting e.g. a usleep(100000) above
the while loop in start_co() (and using multiqueue), giving finish_bh()
a much better chance at exiting before start_co() can yield.
So instead of scheduling finish_bh() in the BDS’s AioContext, schedule
finish_bh() in task->co’s AioContext.
In addition, we can get rid of task.complete altogether because we will
get woken exactly once, when the task is indeed complete, no need to
check.
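Sketch of the rescheduling (simplified; rbd_aio_get_return_value()
and rbd_aio_release() are librbd APIs):

    static void qemu_rbd_completion_cb(rbd_completion_t c, RBDTask *task)
    {
        task->ret = rbd_aio_get_return_value(c);
        rbd_aio_release(c);
        /* run finish_bh() in the context task->co runs in, so it
         * cannot race with the coroutine's yield */
        aio_bh_schedule_oneshot(qemu_coroutine_get_aio_context(task->co),
                                qemu_rbd_finish_bh, task);
    }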
(We could go further and drop the BH, running aio_co_wake() directly in
qemu_rbd_completion_cb() because we are allowed to do that even if the
coroutine isn’t yet yielding and we’re in a different thread – but the
doc comment on qemu_rbd_completion_cb() says to be careful, so I decided
not to go so far here.)
Buglink: https://issues.redhat.com/browse/RHEL-67115
Reported-by: Junyao Zhao <junzhao@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-ID: <20251110154854.151484-3-hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 89d22536d1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>