In the precompute_partition_info_sums_ function, instead of selecting
the 64-bit accumulator when the signal bps is larger than 16, revert to the
original approach based on partition size, but make room for a few extra
bits so as not to overflow with unusual signals where the average residual
magnitude may be larger than bps.
This slightly improves performance at the standard encoding levels with
16-bit files, because the 17-bit side channel can still be processed with
the 32-bit accumulator, while the 64-bit accumulator is still correctly
selected for very large 16-bit partitions.
This is related to commits 6f7ec60c and 187e596e.
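A minimal sketch of a partition-size-based selection like the one described above. The helper names and the headroom value of 4 bits are assumptions made for illustration and are not taken from the patch itself.

```c
#include <stdint.h>

/* Hypothetical headroom, in bits, for residuals larger than the nominal bps. */
#define EXTRA_RESIDUAL_BITS 4u

/* floor(log2(v)); the caller must pass v > 0. */
static unsigned ilog2_u32(uint32_t v)
{
    unsigned l = 0;
    while (v >>= 1)
        l++;
    return l;
}

/* Returns 1 when an unsigned 32-bit accumulator cannot overflow while
 * summing partition_samples magnitudes of (bps + headroom) bits each,
 * and 0 when the 64-bit accumulator should be used instead. */
static int fits_in_32bit_accumulator(uint32_t partition_samples, unsigned bps)
{
    return ilog2_u32(partition_samples) + bps + EXTRA_RESIDUAL_BITS < 32;
}
```

Under these assumptions a 17-bit side channel with 64-sample partitions still uses the 32-bit path (6 + 17 + 4 = 27), while a 16-bit signal with 4096-sample partitions does not (12 + 16 + 4 = 32).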
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Rather than passing the buffer into format_input_() as a FLAC__byte pointer,
pass it as a pointer to a union of three pointers, one each for FLAC__byte,
FLAC__int16 and FLAC__int32.
This should have zero measurable performance impact.
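A rough sketch of the union-of-pointers idea, using standard fixed-width types in place of the FLAC typedefs. The type name, member names and the widen_s16() helper are hypothetical and only illustrate how typed access avoids pointer casts.

```c
#include <stddef.h>
#include <stdint.h>

/* One storage slot viewed as a pointer to 8-, 16- or 32-bit samples. */
typedef union {
    uint8_t *u8;
    int16_t *s16;
    int32_t *s32;
} sample_buffer_ptr;

/* Widen n 16-bit samples to 32 bits through the typed member,
 * so no cast from a byte pointer is needed at the point of use. */
static void widen_s16(const sample_buffer_ptr *in, int32_t *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = in->s16[i];
}
```

The caller stores the buffer address through the member that matches the sample width and reads it back through the same member.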
Restore a FLAC__ASSERT() to the bitmath functions FLAC__bitmath_ilog2 and
FLAC__bitmath_ilog2_wide. This prevents the return of an
"undefined" value.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
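To illustrate the restored assertion on the ilog2 functions mentioned above, here is a minimal sketch using the standard assert() macro instead of FLAC__ASSERT(). The function name is hypothetical and the loop is a plain reference implementation, not the library's code.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)). The logarithm of zero has no meaningful result,
 * so the precondition is checked instead of returning garbage. */
static unsigned ilog2_checked(uint32_t v)
{
    unsigned l = 0;
    assert(v > 0);
    while (v >>= 1)
        l++;
    return l;
}
```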
These were mostly arising from -Wenum-conversion where an enum of
one type was being assigned to a variable of another.
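As an illustration of this class of warning, the sketch below uses two made-up enum types. Assigning a decoder_status directly to an app_status variable is the kind of implicit conversion that gets flagged; an explicit mapping function avoids it. All names here are hypothetical.

```c
typedef enum { DEC_OK, DEC_END_OF_STREAM, DEC_ERROR } decoder_status;
typedef enum { APP_OK, APP_FAILED } app_status;

/* Map one enum type to the other explicitly instead of relying on an
 * implicit conversion between unrelated enum types. */
static app_status to_app_status(decoder_status s)
{
    switch (s) {
    case DEC_OK:
    case DEC_END_OF_STREAM:
        return APP_OK;
    default:
        return APP_FAILED;
    }
}
```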
Originally reported by Lenny Maiorani <lenny@colorado.edu> on the
flac-dev mailing list.
CPU detection used to depend on ASM code. Now CPU features are
also detected when only FLAC__HAS_X86INTRIN is defined.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
The new functions are analogous to FLAC__lpc_restore_signal_asm_ia32_mmx.
FLAC uses them for x86-64 arch and also for ia32 if NASM is not available.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
GCC generates slow ia32 code for FLAC__lpc_restore_signal_wide() and
FLAC__lpc_compute_residual_from_qlp_coefficients_wide(), so 24-bit
encoding/decoding is slower in GCC builds than in MSVS or ICC
builds. This patch adds ia32 asm versions of these functions.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Most non-static functions have the FLAC__ prefix, but it was missing
from the precompute_partition_info_sums_* functions.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
* Split lpc_x86intrin.c into lpc_intrin_sse.c and lpc_intrin_sse2.c
* Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
function to lpc_intrin_sse2.c
* Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
(useful for 24-bit en-/decoding)
* Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
disable precompute_partition_info_sums_32bit_asm_ia32_().
The SSE2 version uses 4 SSE2 instructions instead of the single SSSE3
instruction PABSD, so it is slightly slower.
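For context, one common branch-free way to take an absolute value without a dedicated instruction is the sign-mask idiom (shift, xor, subtract). The scalar sketch below shows it on a single 32-bit value. Whether the SSE2 code in this patch uses exactly this sequence per lane is an assumption, and the function name is hypothetical.

```c
#include <stdint.h>

/* Branch-free |x| using a sign mask. Unsigned arithmetic is used so the
 * result is well defined for every input, including INT32_MIN. */
static uint32_t abs_via_sign_mask(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t mask = (uint32_t)0 - (ux >> 31); /* all ones if x < 0, else 0 */
    return (ux ^ mask) - mask;                /* conditional negate */
}
```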
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Before this patch it was possible to set or get the data.ia32.sse3 value
from x86-64 code, etc., which is a potential source of errors.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
For the 32 bit x86 ASM functions there were already versions of this
function for lags (N = 4, 8, 12). They require lpc_order less than N.
The best compression preset (flac -8) uses lpc_order up to 12, which
means that during encoding FLAC also uses the unaccelerated C function.
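The fallback described above can be sketched as a small dispatch routine. The enum and function names are hypothetical; only the thresholds (N = 4, 8, 12 and the "lpc_order less than N" requirement) come from the text.

```c
typedef enum {
    KERNEL_LAG_4,
    KERNEL_LAG_8,
    KERNEL_LAG_12,
    KERNEL_GENERIC_C
} kernel_choice;

/* Pick the smallest accelerated variant whose N exceeds lpc_order;
 * fall back to the plain C routine when none qualifies. */
static kernel_choice select_kernel(unsigned lpc_order)
{
    if (lpc_order < 4)
        return KERNEL_LAG_4;
    if (lpc_order < 8)
        return KERNEL_LAG_8;
    if (lpc_order < 12)
        return KERNEL_LAG_12;
    return KERNEL_GENERIC_C;
}
```

With lpc_order equal to 12, as used by flac -8, none of the three variants qualifies and the generic C path is chosen.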
Patch-from: lvqcl <lvqcl.mail@gmail.com>
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
Use Benjamin Stiglitz' MIN macros from gcc 4.3 (according to the
changelog, __COUNTER__ was introduced in this version). Previously,
the macros weren't used on any existing gcc version; the first one
would have been 5.5.
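A sketch of what such a version guard and a __COUNTER__-based MIN macro can look like. This is not the macro from the source tree; the names SAFE_MIN, SAFE_MIN_IMPL and CONCAT2 are made up for illustration, and the plain ternary fallback is an assumption.

```c
/* Use the GNU-extension version from gcc 4.3 onwards. */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
#define CONCAT2_(a, b) a##b
#define CONCAT2(a, b) CONCAT2_(a, b)
/* Each expansion gets uniquely named temporaries, so nested uses do not
 * shadow each other and each argument is evaluated exactly once. */
#define SAFE_MIN_IMPL(a, b, id) __extension__ ({ \
    __typeof__(a) CONCAT2(min_a_, id) = (a); \
    __typeof__(b) CONCAT2(min_b_, id) = (b); \
    CONCAT2(min_a_, id) < CONCAT2(min_b_, id) ? \
        CONCAT2(min_a_, id) : CONCAT2(min_b_, id); })
#define SAFE_MIN(a, b) SAFE_MIN_IMPL(a, b, __COUNTER__)
#else
/* Portable fallback: may evaluate an argument twice. */
#define SAFE_MIN(a, b) ((a) < (b) ? (a) : (b))
#endif
```

Note that compilers which report themselves as an older gcc take the fallback branch, so arguments with side effects should be avoided in portable code.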
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
The problem was that the function safe_malloc_mul_2op_() was originally
defined as static inline in include/share/alloc.h but had to be moved
because GCC was refusing to inline it. Once moved, however, static linking
would fail when building the flac executable because the function ended
up being linked twice.
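For readers unfamiliar with the pattern, the sketch below shows a header-style static inline helper that performs an overflow-checked multiplication before allocating. Because the function has internal linkage, each translation unit gets its own copy and no duplicate external symbol is produced. The function name and body are hypothetical, and how this patch actually resolves the double-link problem is not stated here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate a * b bytes, refusing requests whose size overflows size_t. */
static inline void *checked_malloc_mul(size_t a, size_t b)
{
    if (a == 0 || b == 0)
        return malloc(1);   /* still hand back a freeable pointer */
    if (a > SIZE_MAX / b)
        return NULL;        /* a * b would overflow size_t */
    return malloc(a * b);
}
```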
This patch adds support for other compilers and systems
including MSVC, Intel C compiler etc.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
- INCLUDES is deprecated, and CPPFLAGS is a user-defined
variable; use the proper AM_CPPFLAGS instead.
- Remove the FLAC__INLINE definition, providing a proper
replacement for MSVC compilers.
- Detect if we have C99's lround and provide a replacement
for Windows...
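A minimal sketch of what a replacement for lround can look like on a platform that lacks it. The function name is hypothetical, this is not the code from the patch, and it ignores edge cases such as values outside the range of long or inputs just below 0.5.

```c
/* Round to the nearest integer, with halfway cases rounded away from
 * zero, which is the rounding rule C99 specifies for lround(). */
static long fallback_lround(double x)
{
    return (long)(x >= 0.0 ? x + 0.5 : x - 0.5);
}
```

A build system would typically select between the C library function and such a fallback with a configure-time check.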