Commit Graph

1017 Commits

Author SHA1 Message Date
Erik de Castro Lopo
c43691586a libFLAC : Remove FLAC__precompute_partition_info_sums_32bit_asm_ia32_().
This function offers no speed-up over the C version and was
commented out after the release of 1.3.0. We will now drop it completely.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-07-06 20:21:04 +10:00
Miroslav Lichvar
f081524c19 stream_encoder : Improve selection of residual accumulator width
In the precompute_partition_info_sums_ function, instead of selecting
the 64-bit accumulator whenever the signal bps is larger than 16, revert
to the original approach based on partition size, but make room for a few
extra bits so as not to overflow with unusual signals where the average
residual magnitude may be larger than bps.

It slightly improves the performance with standard encoding levels and
16-bit files as the 17-bit side channel can still be processed with the
32-bit accumulator and correctly selects the 64-bit accumulator with
very large 16-bit partitions.

This is related to commits 6f7ec60c and 187e596e.

Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
2014-07-04 21:22:44 +10:00
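The accumulator choice described in f081524c19 can be sketched roughly as follows. This is an illustration only, not the libFLAC source: the names `floor_log2` and `needs_64bit_accumulator` are hypothetical, and the 4-bit margin is an assumed value for the "few extra bits" the message mentions.

```c
#include <stdint.h>

/* Assumed safety margin in bits; the commit message only says "a few". */
#define EXTRA_RESIDUAL_BITS 4u

/* floor(log2(v)) for v > 0; returns 0 for v == 0 (hypothetical helper). */
static unsigned floor_log2(uint32_t v)
{
    unsigned l = 0;
    while (v >>= 1)
        l++;
    return l;
}

/* Returns 1 when summing `partition_samples` residuals of roughly `bps`
 * bits each (plus the margin) could overflow a 32-bit accumulator. */
static int needs_64bit_accumulator(unsigned bps, uint32_t partition_samples)
{
    return floor_log2(partition_samples) + bps + EXTRA_RESIDUAL_BITS >= 32;
}
```

Under these assumptions a 17-bit side channel with 256-sample partitions still fits the 32-bit path, while 16-bit audio with 4096-sample partitions switches to 64 bits.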
Erik de Castro Lopo
5e8854fa84 src/ : Remove un-needed MSVC6 workaround.
MSVC6 was not able to cast from a uint64_t to a double and this
commit removes some #ifdef hackery designed to work around this
problem.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-07-04 08:32:01 +10:00
Erik de Castro Lopo
7590d99b29 libFLAC/md5: Fix for cast-align warnings on ARM.
Rather than passing the buffer into format_input_() as a FLAC__byte pointer,
pass it as a pointer to a union of three pointers, one each for FLAC__byte,
FLAC__int16 and FLAC__int32.
This should have zero measurable performance impact.
2014-06-29 21:53:01 +10:00
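A minimal sketch of the union-of-pointers idea from 7590d99b29, using standard fixed-width types in place of the FLAC typedefs. The names `sample_buffer_ptr` and `store_samples16` are hypothetical and do not come from md5.c.

```c
#include <stddef.h>
#include <stdint.h>

/* One pointer value, viewed with three element types. Picking the member
 * avoids casting a byte pointer to a wider type, which is what triggers
 * -Wcast-align on strict-alignment targets such as ARM. */
typedef union {
    uint8_t *u8;
    int16_t *s16;
    int32_t *s32;
} sample_buffer_ptr;

/* Store n samples as 16-bit values through the s16 view of the buffer.
 * The caller must have set buf->s16 to suitably aligned storage. */
static void store_samples16(sample_buffer_ptr *buf, const int32_t *in, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        buf->s16[i] = (int16_t)in[i];
}
```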
Erik de Castro Lopo
51c6567f62 libFLAC/md5.c : Massive refactoring of format_input_ function.
This refactoring is in preparation for fixing the cast-align warning when
compiling on ARM (and possibly others). Testing on stereo 16 bit files
suggests that the difference between the performance of this code and the
old code is negligible (tested only on amd64/linux).
2014-06-29 21:07:07 +10:00
Erik de Castro Lopo
a8567d4b4e libFLAC/bitmath : Restore an ASSERT that was removed some time after 1.2.1.
Restore a FLAC__ASSERT() to the bitmath functions FLAC__bitmath_ilog2 and
FLAC__bitmath_ilog2_wide. This prevents the return of an
"undefined" value.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-06-28 22:25:18 +10:00
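Why the assertion in a8567d4b4e matters: log2 of zero has no integer answer, so an ilog2 routine either returns an arbitrary value or must reject the input. A portable sketch, with the hypothetical name `ilog2_u32` and plain `assert` standing in for FLAC__ASSERT:

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(v)). The precondition v > 0 is enforced in debug builds,
 * because no meaningful result exists for v == 0. */
static unsigned ilog2_u32(uint32_t v)
{
    unsigned l = 0;
    assert(v > 0);
    while (v >>= 1)
        l++;
    return l;
}
```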
Erik de Castro Lopo
9aa1546436 libFLAC/lpc_intrin_sseN.c : Disambiguate macro names.
Previously, the files lpc_intrin_sse2.c and lpc_intrin_sse41.c both defined
macros RESIDUAL_RESULT and DATA_RESULT. This situation made it impossible
to merge these files which we may do at some stage.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-06-28 22:25:18 +10:00
Erik de Castro Lopo
2c15052550 libFLAC: CPUID detection improvements.
According to docs, it's incorrect to just call CPUID with EAX=1.
One must ensure that this value is supported.

CPUs that don't support CPUID level 1 are very old, but...
if FLAC tests CPUID presence it should also test CPUID level support.

Also the function FLAC__cpu_have_cpuid_asm_ia32 was simplified
according to the docs at Intel website and in Wikipedia.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-06-28 22:25:18 +10:00
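The check described in 2c15052550 (query the highest supported CPUID level before asking for level 1) might look like this when written against the `<cpuid.h>` helpers shipped with GCC and Clang. This is a sketch, not the libFLAC code: `cpu_has_sse2` and `HAVE_GNU_CPUID` are hypothetical names, and other compilers or architectures simply report no SSE2 here.

```c
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_GNU_CPUID 1
#endif

/* Returns 1 if CPUID level 1 is supported and reports SSE2, else 0. */
static int cpu_has_sse2(void)
{
#ifdef HAVE_GNU_CPUID
    unsigned eax, ebx, ecx, edx;

    /* __get_cpuid_max() returns 0 when CPUID itself is unavailable,
     * otherwise the highest supported basic level. */
    if (__get_cpuid_max(0, 0) < 1)
        return 0;

    __cpuid(1, eax, ebx, ecx, edx);
    (void)eax; (void)ebx; (void)ecx;
    return (int)((edx >> 26) & 1u);   /* EDX bit 26 = SSE2 */
#else
    return 0;   /* no CPUID helper available in this build */
#endif
}
```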
Erik de Castro Lopo
6ccef14727 stream_decoder : Two read_metadata() fixes from 1.2.1 maintenance branch.
* Fix leaks in read_metadata_() that could occur because of read errors or
  malformed streams.
    http://flac.cvs.sourceforge.net/viewvc/flac/flac/src/libFLAC/stream_decoder.c?r1=1.147&r2=1.147.2.1&pathrev=FLAC_RELEASE_1_2_1_MAINTENANCE_BRANCH

* Fix metadata block initialization bug in read_metadata_().
    http://flac.cvs.sourceforge.net/viewvc/flac/flac/src/libFLAC/stream_decoder.c?r1=1.147.2.1&r2=1.147.2.2&pathrev=FLAC_RELEASE_1_2_1_MAINTENANCE_BRANCH

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-06-28 09:27:33 +10:00
Erik de Castro Lopo
46bedb58d3 Update URLs as needed.
Sourceforge.net links updated as needed with some of them
being changed to point to xiph.org/flac.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-06-28 09:12:45 +10:00
Erik de Castro Lopo
987f74ae7a Corrections for comments.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-06-28 09:10:32 +10:00
Evan Ramos
9df6736ec0 Update Makefile.lite build system.
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
2014-06-24 21:02:24 +10:00
Erik de Castro Lopo
b8d58e327c Revert "Replace FLAC__CPU_X86_64 with FLaC__CPU_X86_64."
This reverts commit 151739921b.

This patch only went part way to replacing all FLAC_* with FLaC_*
and it's really not worth going all the way.
2014-06-15 20:29:34 +10:00
Erik de Castro Lopo
151739921b Replace FLAC__CPU_X86_64 with FLaC__CPU_X86_64.
Previous autoconf versions had problems with variables beginning with
'FLAC_' (autoconf uses 'AC_').

Reported-by: lvqcl <lvqcl.mail@gmail.com>
2014-06-01 17:33:54 +10:00
Erik de Castro Lopo
3d394924ff libFLAC/ : Refactoring and add comments.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-05-14 21:28:55 +10:00
Erik de Castro Lopo
7963120a0d src/libFLAC/memory.c : Wrap inclusion of <stdint.h> in #ifdef.
Lack of the #ifdef was causing problems on VS2008.
2014-05-11 03:06:49 -07:00
Erik de Castro Lopo
11b004cacf libFLAC/stream_encoder.c : Fix if else wibble.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-15 18:36:46 +10:00
Erik de Castro Lopo
619d821b68 Add files missing from commit 93f6109c90.
2014-04-12 07:13:08 +10:00
Erik de Castro Lopo
93f6109c90 Add intrinsics version of two lpc functions.
Functions:
- FLAC__fixed_compute_best_predictor
- FLAC__fixed_compute_best_predictor_wide

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-11 06:21:15 +10:00
Erik de Castro Lopo
d456cdd28a Suppress MSVS warnings when compiling for x86-64.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-11 06:18:52 +10:00
Erik de Castro Lopo
1a6df83163 Use _M_X64 instead of _WIN64.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-11 06:16:37 +10:00
Erik de Castro Lopo
3f5208c300 Fix clang compiler warnings.
These were mostly arising from -Wenum-conversion where an enum of
one type was being assigned to a variable of another.

Originally reported by Lenny Maiorani <lenny@colorado.edu> on the
flac-dev mailing list.
2014-04-09 18:09:03 +10:00
Erik de Castro Lopo
ac940e4175 libFLAC/cpu.c : Bundle of minor fixes.
Includes:

* Replace 'CALLBACK' with 'WINAPI' because the signature of an unhandled
  exception filter uses 'WINAPI'.
* Improvements to OS SSE testing code.
* Improvements to GCC asm code.
* Comment fixes.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-27 19:30:47 +11:00
Erik de Castro Lopo
006b8356d5 Fix all instances of '#if HAVE_CONFIG_H'.
Should be '#ifdef HAVE_CONFIG_H'.

Closes: https://sourceforge.net/p/flac/bugs/410/
2014-03-24 12:06:49 +11:00
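One case where the two directives from 006b8356d5 differ, shown as a sketch. The assumption here is a build that defines HAVE_CONFIG_H with no value; the commit message does not say which case prompted the bug report. `CONFIG_SEEN` and `config_header_requested` are hypothetical names.

```c
/* Simulate a build system that defines the macro with no value. */
#define HAVE_CONFIG_H

/* '#ifdef' only asks whether the name is defined, so this works.
 * '#if HAVE_CONFIG_H' would expand to '#if' with no expression and
 * fail to compile; with the macro undefined it would silently
 * evaluate to 0 (and warn under -Wundef). */
#ifdef HAVE_CONFIG_H
#define CONFIG_SEEN 1
#else
#define CONFIG_SEEN 0
#endif

static int config_header_requested(void)
{
    return CONFIG_SEEN;
}
```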
Erik de Castro Lopo
49d9d742e2 metadata_object.c : Fix handling of zero length vorbis comment string.
Previously if a zero length string was passed in, the pointer would be
stored regardless of the copy parameter. If the original source pointer
was reassigned to something else bad things could happen.

Closes:  https://sourceforge.net/p/flac/bugs/377/
2014-03-23 21:41:01 +11:00
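The behaviour that 49d9d742e2 fixes can be illustrated with a small setter that honours the copy flag even for a zero-length string. This is a sketch under assumptions: `set_entry` and its ownership rules are hypothetical and are not the metadata_object.c API.

```c
#include <stdlib.h>
#include <string.h>

/* Set *dest from src. With copy != 0 a private, NUL-terminated copy is
 * always made, including when length == 0, so later changes to the
 * caller's pointer cannot affect *dest. With copy == 0 ownership of src
 * passes to *dest. Returns 1 on success, 0 on allocation failure. */
static int set_entry(unsigned char **dest, unsigned char *src,
                     size_t length, int copy)
{
    if (copy) {
        unsigned char *p = (unsigned char *)malloc(length + 1);
        if (p == NULL)
            return 0;
        if (length > 0)
            memcpy(p, src, length);
        p[length] = '\0';
        free(*dest);
        *dest = p;
    } else {
        free(*dest);
        *dest = src;
    }
    return 1;
}
```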
Erik de Castro Lopo
697dbdee8f Revert "Attempt to fix differences between x86 FPU and SSE calculations."
This reverts commit 70b078cfd5.

The code in the patch we're reverting probably only works for one
compiler and could easily stop working with the next release of
that compiler.
2014-03-23 19:58:44 +11:00
Erik de Castro Lopo
70b078cfd5 Attempt to fix differences between x86 FPU and SSE calculations.
The x86 FPU holds intermediate results in larger registers than what
the SSE unit uses, resulting in slightly different encodings of audio
data. Attempt to fix this by modifying libFLAC/lpc.c to store calculation
results in a FLAC__real before adding it to a sum.

At the moment this works, but I could easily imagine a new version of
the compiler optimising this store to the FLAC__real away leaving us
in the same situation we have now.

Patch-from: Oliver Stöneberg on sourceforge.net
Closes: https://sourceforge.net/p/flac/bugs/409/
2014-03-21 19:26:08 +11:00
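The idea behind 70b078cfd5 (round each product to the narrower type before accumulating) can be sketched as below. Two assumptions are made: `float` stands in for FLAC__real, and `volatile` is used to force the store, which is a swapped-in technique and not what the patch did. The function name `autocorr_lag` is hypothetical.

```c
#include <stddef.h>

/* Autocorrelation of `data` at one lag. Each product is written to a
 * float-sized object before being added, so extra x87 register precision
 * is discarded at the same point where SSE code would round. */
static float autocorr_lag(const float *data, size_t n, size_t lag)
{
    float sum = 0.0f;
    size_t i;
    for (i = lag; i < n; i++) {
        volatile float prod = data[i] * data[i - lag];
        sum += prod;
    }
    return sum;
}
```

Without `volatile`, an optimizer is free to remove the intermediate store, which is the fragility the commit message itself points out.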
Erik de Castro Lopo
d7e6d91fba Fix build issue on OSX with GCC 4.2/Xcode.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-21 17:45:21 +11:00
Erik de Castro Lopo
99d5154f43 libFLAC/cpu.c : Detect SSE correctly on Windows when compiling with MinGW.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:40:56 +11:00
Erik de Castro Lopo
47bd9964fa stream_encoder/decoder : Comment fixes.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Erik de Castro Lopo
f931d13411 libFLAC/format.c : Remove MSVC6 specific hack.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Erik de Castro Lopo
bdcbce4bfa lpc_asm.nasm : Whitespace.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Erik de Castro Lopo
d36ef6298b Whitespace.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Fengwei Yin
7cbecbae9f Using uintptr_t to simplify pointer handling a little bit
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
2014-03-14 15:32:50 +11:00
Erik de Castro Lopo
15e28a4b94 src/libFLAC/ : CPU feature detection improvements.
CPU detection used to depend on ASM code. Now CPU features are
also detected when only FLAC__HAS_X86INTRIN is defined.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-02 21:36:45 +11:00
Erik de Castro Lopo
ace63cc828 stream_encoder.c : ifdef cleanup.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-25 18:38:20 +11:00
Erik de Castro Lopo
b334fb2a5c Fix typos in comments.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-24 21:47:20 +11:00
Erik de Castro Lopo
cf0e42ae6e Don't use intrinsics when they are slower.
More thorough en-/decoding tests show that sometimes the functions
that use intrinsics are slower (or not really faster) than old
plain C functions.

After this patch the encoder doesn't use these new functions
when their usefulness is questionable.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-24 21:46:05 +11:00
Erik de Castro Lopo
71c9555366 bitmath.h : Fixes for FLAC__bitmath_ilog2_wide().
Existing version had a number of problems:
1) it didn't compile with MSVS
2) it returned correct results only when compiled with GNUC
3) it mentioned LGPL which isn't good for a BSD-licensed library

LGPL -> BSD issue documented here:
http://lists.xiph.org/pipermail/flac-dev/2013-September/004356.html

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-02 10:42:20 +11:00
Erik de Castro Lopo
26b9546149 Add sse2 intrinsics code for lpc_restore_signal_...()
The new functions are analogous to FLAC__lpc_restore_signal_asm_ia32_mmx.
FLAC uses them for x86-64 arch and also for ia32 if NASM is not available.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-02 08:55:56 +11:00
Erik de Castro Lopo
d163ef4567 libFLAC/stream_encoder.c : Fall back to intrinsics if NASM is not available.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-01 20:34:55 +11:00
Erik de Castro Lopo
59cfca0030 stream_encoder : Remove un-needed conversion from __m128i to FLAC__uint64.
Encoding speed slightly increased (1...2% for FLAC -8).

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-31 20:54:59 +11:00
Erik de Castro Lopo
4618512de2 Add a fast shift for int64 values.
This patch changes the code from:
	(FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
into:
	_mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));

Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-31 20:36:23 +11:00
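A self-contained sketch of the shift from 4618512de2. The function name `shift_low32` and the non-SSE2 fallback are hypothetical additions so the example builds anywhere; it assumes the shift count is between 0 and 32, in which range the low 32 bits of a logical shift match those of an arithmetic shift.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Shift a 64-bit sum right by `shift` (0..32) and keep the low 32 bits. */
static int32_t shift_low32(int64_t sum, int shift)
{
#if defined(__SSE2__)
    /* Load the 64-bit value into the low half of an XMM register,
     * shift it there, and extract the low 32 bits directly. */
    __m128i x = _mm_loadl_epi64((const __m128i *)&sum);
    return _mm_cvtsi128_si32(_mm_srli_epi64(x, shift));
#else
    /* Portable fallback: logical shift, then truncate. */
    return (int32_t)((uint64_t)sum >> shift);
#endif
}
```

Keeping the value in the vector register avoids the 64-bit shift in general-purpose registers, which on 32-bit x86 takes several instructions.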
Erik de Castro Lopo
a03999f570 lpc_intrin_sse2.c : Add RESIDUAL16_RESULT macro.
RESIDUAL16_RESULT is analogous to the existing RESIDUAL_RESULT macro
and simplifies the code a little.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 22:17:08 +11:00
Erik de Castro Lopo
1d920993f1 Remove redundant inline macro def.
The inline macro already exists in include/share/compat.h.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 21:57:21 +11:00
Erik de Castro Lopo
57297eea26 Add __INTEL_COMPILER to _MSC_VER #ifdefs.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 21:53:41 +11:00
Erik de Castro Lopo
d40e986a1e Add FLAC__SSE_SUPPORTED and FLAC__SSE2_SUPPORTED flags.
* Allow compiling using GCC w/o SSE support.
* Allow SSE4.1 intrinsic functions to be enabled.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 21:49:55 +11:00
Erik de Castro Lopo
c2747bec1c lpc_asm.nasm : More 'mov cl' -> 'mov ecx' fixes.
According to Agner Fog in optimizing_assembly.pdf:

  "... write to a partial register may result in false dependencies
   between instructions, so it is better to avoid it."

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-18 07:55:19 +11:00
Erik de Castro Lopo
7e9278934e libFLAC : Add asm versions for two _wide() functions.
GCC generates slow ia32 code for FLAC__lpc_restore_signal_wide() and
FLAC__lpc_compute_residual_from_qlp_coefficients_wide(), so 24-bit
encoding/decoding is slower in a GCC build than in an MSVS or ICC
build. This patch adds ia32 asm versions of these functions.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-07 21:35:08 +11:00
Erik de Castro Lopo
8e4a45ac86 libFLAC/ia32/lpc_asm.nasm : Match calls and returns.
According to Agner Fog, "...you must make sure that all calls
are matched with returns. Never jump out of a subroutine without
a return and never use a return as an indirect jump."

(see paragraph 3.15 in microarchitecture.pdf and
examples 3.5a and 3.5b in optimizing_assembly.pdf)

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-07 21:27:09 +11:00