These mostly arose from -Wenum-conversion warnings, where a value of
one enum type was being assigned to a variable of another.
Originally reported by Lenny Maiorani <lenny@colorado.edu> on the
flac-dev mailing list.
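A minimal sketch of the kind of fix this warning calls for. The enum
types and the helper below are hypothetical and are not FLAC's real
types; the point is only that the conversion is made explicit instead
of assigning a value of one enum type to a variable of another.

```c
/* Hypothetical enum types, for illustration only. */
typedef enum { DECODER_OK = 0, DECODER_ERROR = 1 } decoder_status;
typedef enum { ENCODER_OK = 0, ENCODER_ERROR = 1 } encoder_status;

/* Map one enum type to the other explicitly. A direct assignment such as
 * "encoder_status e = some_decoder_status;" is what -Wenum-conversion
 * reports. */
static encoder_status encoder_status_from_decoder(decoder_status s)
{
    switch (s) {
    case DECODER_OK:
        return ENCODER_OK;
    case DECODER_ERROR:
        return ENCODER_ERROR;
    }
    /* Unknown input values are treated as errors. */
    return ENCODER_ERROR;
}
```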
Includes:
* Replace 'CALLBACK' with 'WINAPI' because the signature of an unhandled
exception filter uses 'WINAPI'.
* Improvements to OS SSE testing code.
* Improvements to GCC asm code.
* Comment fixes.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Previously, if a zero-length string was passed in, the pointer would be
stored regardless of the copy parameter. If the original source pointer
was later reassigned to something else, bad things could happen.
Closes: https://sourceforge.net/p/flac/bugs/377/
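The intended behaviour can be sketched as follows. The function name
and signature are hypothetical and do not correspond to the actual
libFLAC function; the sketch only shows the copy parameter being
honoured for a zero-length string too.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical setter: stores `src` (length `len`) in *dest.
 * With copy != 0 the data is always duplicated, even when len == 0,
 * so the caller may reuse or free `src` afterwards.
 * With copy == 0 ownership of `src` passes to *dest.
 * Returns 1 on success, 0 on allocation failure. */
static int set_entry(char **dest, char *src, size_t len, int copy)
{
    char *p;

    if (copy) {
        p = (char *)malloc(len + 1);
        if (p == NULL)
            return 0;
        if (len > 0)
            memcpy(p, src, len);
        p[len] = '\0';
    } else {
        p = src;
    }
    free(*dest); /* free(NULL) is a no-op */
    *dest = p;
    return 1;
}
```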
This reverts commit 70b078cfd5.
The code in the patch we're reverting probably only works for one
compiler and could easily stop working with the next release of
that compiler.
The x86 FPU holds intermediate results in larger registers than what
the SSE unit uses, resulting in slightly different encodings of audio
data. Attempt to fix this by modifying libFLAC/lpc.c to store calculation
results in a FLAC__real before adding them to a sum.
At the moment this works, but I could easily imagine a new version of
the compiler optimising this store to the FLAC__real away, leaving us
in the same situation we have now.
Patch-from: Oliver Stöneberg on sourceforge.net
Closes: https://sourceforge.net/p/flac/bugs/409/
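A rough sketch of the pattern the reverted patch relied on. The
function name is hypothetical, and the typedef of FLAC__real as float
is an assumption made so the sketch is self-contained. Nothing in the
C language forces the compiler to actually round the temporary to
float precision, which is why the approach is fragile.

```c
/* Assumption for this sketch: FLAC__real is a float. */
typedef float FLAC__real;

/* Hypothetical autocorrelation term for a single lag. */
static FLAC__real autocorr_lag(const FLAC__real *data, unsigned len,
                               unsigned lag)
{
    FLAC__real sum = 0.0f;
    unsigned i;

    for (i = lag; i < len; i++) {
        /* The product is stored in a FLAC__real first, in the hope that
         * it gets rounded to float precision before being accumulated.
         * An optimising compiler may keep it in a wider x87 register. */
        FLAC__real tmp = data[i] * data[i - lag];
        sum += tmp;
    }
    return sum;
}
```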
CPU detection used to depend on ASM code. Now CPU features are
also detected when only FLAC__HAS_X86INTRIN is defined.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
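A minimal sketch of feature detection without any hand-written asm,
assuming a GCC or Clang toolchain that ships <cpuid.h>. The struct and
function names are hypothetical, and the sketch does not check whether
the OS supports SSE; it only reads the CPUID feature bits.

```c
#include <string.h>

#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
#include <cpuid.h>
#define SKETCH_HAS_CPUID 1
#else
#define SKETCH_HAS_CPUID 0
#endif

/* Hypothetical feature flags, each 0 or 1. */
typedef struct {
    int sse2, sse3, ssse3, sse41;
} cpu_features;

static void detect_cpu(cpu_features *f)
{
    memset(f, 0, sizeof *f);
#if SKETCH_HAS_CPUID
    {
        unsigned int eax, ebx, ecx, edx;
        /* CPUID leaf 1 reports the standard feature bits. */
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
            f->sse2  = (int)((edx >> 26) & 1u);
            f->sse3  = (int)((ecx >> 0) & 1u);
            f->ssse3 = (int)((ecx >> 9) & 1u);
            f->sse41 = (int)((ecx >> 19) & 1u);
        }
    }
#endif
}
```

On other compilers or architectures the sketch reports no features.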
More thorough en-/decoding tests show that sometimes the functions
that use intrinsics are slower (or not really faster) than the old
plain C functions.
After this patch the encoder doesn't use these new functions
when their usefulness is questionable.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
The new functions are analogous to FLAC__lpc_restore_signal_asm_ia32_mmx.
FLAC uses them for x86-64 arch and also for ia32 if NASM is not available.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
According to Agner Fog in optimizing_assembly.pdf:
"... write to a partial register may result in false dependencies
between instructions, so it is better to avoid it."
Patch-from: lvqcl <lvqcl.mail@gmail.com>
GCC generates slow ia32 code for FLAC__lpc_restore_signal_wide() and
FLAC__lpc_compute_residual_from_qlp_coefficients_wide(), so 24-bit
encoding/decoding is slower for GCC builds than for MSVS or ICC
builds. This patch adds ia32 asm versions of these functions.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
According to Agner Fog, "...you must make sure that all calls
are matched with returns. Never jump out of a subroutine without
a return and never use a return as an indirect jump."
(see paragraph 3.15 in microarchitecture.pdf and
examples 3.5a and 3.5b in optimizing_assembly.pdf)
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Most non-static functions have a FLAC__ prefix, but it was missing
from the precompute_partition_info_sums_* functions.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Besides SPE (FSL e500v? cores) there are other powerpc processors
that don't support altivec instructions, so only enable them when it is
certain that the target supports them.
Signed-off-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Restrict works very poorly in Visual Studio (much slower than without),
so define flac_restrict in share/compat.h and use that in:
lpc_compute_residual...()
lpc_restore_signal...()
As a result, FLAC__lpc_compute_residual_from_qlp_coefficients_wide_intrin_sse41()
offers no advantage for 64-bit compiles and was removed from the x86-64 part
of stream_encoder.c.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
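One plausible shape for such a macro is sketched below. This is an
assumption, not the actual contents of share/compat.h: here the macro
expands to nothing under MSVC and to the compiler's restrict keyword
elsewhere. The residual function is a simplified, hypothetical
stand-in for the lpc_compute_residual...() family.

```c
/* Assumed definition; the real one in share/compat.h may differ. */
#if defined(_MSC_VER)
#define flac_restrict /* expands to nothing under Visual Studio */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define flac_restrict restrict
#elif defined(__GNUC__)
#define flac_restrict __restrict__
#else
#define flac_restrict
#endif

/* Hypothetical residual computation. `data` must point at the first
 * sample to predict, with `order` valid history samples before it. */
static void compute_residual(const int *flac_restrict data, int len,
                             const int *flac_restrict qlp_coeff, int order,
                             int shift, int *flac_restrict residual)
{
    int i, j;

    for (i = 0; i < len; i++) {
        int sum = 0;
        for (j = 0; j < order; j++)
            sum += qlp_coeff[j] * data[i - j - 1];
        residual[i] = data[i] - (sum >> shift);
    }
}
```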
Replaces
OutputDirectory="..\..\..\..\objs\debug\bin"
with
OutputDirectory="$(SolutionDir)objs\$(ConfigurationName)\bin"
and so on.
Removes
OutputFile="..\..\objs\debug\lib\$(ProjectName).lib"
when possible.
Also, in the current version the "Whole program optimization" compiler
option is set, but the corresponding linker option isn't. From MSDN:
"If you do not explicitly specify /LTCG when you pass /GL or MSIL modules
to the linker, the linker eventually detects this and restarts the link
by using /LTCG. Explicitly specify /LTCG when you pass /GL and MSIL modules
to the linker for the fastest possible build performance."
So the /LTCG option was added too.
The debug build now uses libogg_static.lib from the .\objs\debug\lib folder.
(the dependency for both release and debug is
objs\$(ConfigurationName)\lib\libogg_static.lib)
Patch-from: lvqcl <lvqcl.mail@gmail.com>
* Split lpc_x86intrin.c into lpc_intrin_sse.c and lpc_intrin_sse2.c
* Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
function to lpc_intrin_sse2.c
* Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
(useful for 24-bit en-/decoding)
* Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
disable precompute_partition_info_sums_32bit_asm_ia32_().
SSE2 version uses 4 SSE2 instructions instead of 1 SSSE3 instruction
PABSD so it is slightly slower.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
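For reference, a common way to compute a 32-bit absolute value with
SSE2 only is sketched below. This is not necessarily the sequence used
in the patch, and the function name is hypothetical. With SSSE3 the
three arithmetic steps collapse into a single PABSD (_mm_abs_epi32).

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: out[i] = |in[i]| for four 32-bit values. */
static void abs4_sketch(const int32_t *in, uint32_t *out)
{
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    /* mask is all ones for negative lanes, zero otherwise */
    __m128i mask = _mm_srai_epi32(v, 31);
    /* (v ^ mask) - mask negates exactly the negative lanes */
    __m128i a = _mm_sub_epi32(_mm_xor_si128(v, mask), mask);
    _mm_storeu_si128((__m128i *)out, a);
#else
    /* Scalar fallback when SSE2 is not enabled at compile time. */
    int i;
    for (i = 0; i < 4; i++)
        out[i] = in[i] < 0 ? 0u - (uint32_t)in[i] : (uint32_t)in[i];
#endif
}
```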
Before this patch it was possible to set or get the data.ia32.sse3 value
from x86-64 code, etc., which is a potential source of errors.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
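The idea can be sketched as below: architecture-specific members only
exist when compiling for that architecture, so touching the ia32
fields from x86-64 code becomes a compile error. The struct layout and
names are hypothetical and do not reproduce the real cpu.h.

```c
#if defined(__i386__) || defined(_M_IX86)
#define SKETCH_ARCH_IA32 1
#elif defined(__x86_64__) || defined(_M_X64)
#define SKETCH_ARCH_X64 1
#endif

/* Hypothetical CPU info structure. */
typedef struct {
    int use_asm;
#if defined(SKETCH_ARCH_IA32)
    struct { int sse2; int sse3; } ia32; /* only visible on ia32 */
#elif defined(SKETCH_ARCH_X64)
    struct { int sse3; int ssse3; } x64; /* only visible on x86-64 */
#endif
} cpu_info;

/* Reads the SSE3 flag through the member valid for this target. */
static int has_sse3(const cpu_info *info)
{
#if defined(SKETCH_ARCH_IA32)
    return info->ia32.sse3;
#elif defined(SKETCH_ARCH_X64)
    return info->x64.sse3;
#else
    (void)info;
    return 0;
#endif
}
```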