1002 Commits

Author SHA1 Message Date
Erik de Castro Lopo
7963120a0d src/libFLAC/memory.c : Wrap inclusion of <stdint.h> in #ifdef.
Lack of the #ifdef was causing problems on VS2008.
2014-05-11 03:06:49 -07:00
Erik de Castro Lopo
11b004cacf libFLAC/stream_encoder.c : Fix if else wibble.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-15 18:36:46 +10:00
Erik de Castro Lopo
619d821b68 Add files missing from commit 93f6109c90. 2014-04-12 07:13:08 +10:00
Erik de Castro Lopo
93f6109c90 Add intrinsics version of two lpc functions.
Functions:
- FLAC__fixed_compute_best_predictor
- FLAC__fixed_compute_best_predictor_wide

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-11 06:21:15 +10:00
Erik de Castro Lopo
d456cdd28a Suppress MSVS warnings when compiling for x86-64.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-11 06:18:52 +10:00
Erik de Castro Lopo
1a6df83163 Use _M_X64 instead of _WIN64.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-04-11 06:16:37 +10:00
Erik de Castro Lopo
3f5208c300 Fix clang compiler warnings.
These were mostly arising from -Wenum-conversion where an enum of
one type was being assigned to a variable of another.

Originally reported by Lenny Maiorani <lenny@colorado.edu> on the
flac-dev mailing list.
2014-04-09 18:09:03 +10:00
Erik de Castro Lopo
ac940e4175 libFLAC/cpu.c : Bundle of minor fixes.
Includes:

* Replace 'CALLBACK' with 'WINAPI' because the signature of an unhandled
  exception filter uses 'WINAPI'.
* Improvements to OS SSE testing code.
* Improvements to GCC asm code.
* Comment fixes.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-27 19:30:47 +11:00
Erik de Castro Lopo
006b8356d5 Fix all instances of '#if HAVE_CONFIG_H'.
Should be '#ifdef HAVE_CONFIG_H'.

Closes: https://sourceforge.net/p/flac/bugs/410/
2014-03-24 12:06:49 +11:00
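A minimal sketch of why the distinction in the commit above matters, assuming the macro is defined without a value (the commit message does not say how HAVE_CONFIG_H is defined in any given build). The function name config_available is hypothetical.

```c
/* Defined with no value, as a build system or header might do. */
#define HAVE_CONFIG_H

/* "#if HAVE_CONFIG_H" would expand to "#if" with no expression here,
 * which is a preprocessor error. "#ifdef" only asks whether the macro
 * exists, so it works for both valued and empty definitions. */
#ifdef HAVE_CONFIG_H
#define CONFIG_AVAILABLE 1
#else
#define CONFIG_AVAILABLE 0
#endif

/* Hypothetical helper so the result can be inspected at run time. */
int config_available(void)
{
    return CONFIG_AVAILABLE;
}
```

When the macro is not defined at all, "#if HAVE_CONFIG_H" silently evaluates to 0 and may trigger a warning under -Wundef, which is another reason to prefer "#ifdef".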
Erik de Castro Lopo
49d9d742e2 metadata_object.c : Fix handling of zero length vorbis comment string.
Previously if a zero length string was passed in, the pointer would be
stored regardless of the copy parameter. If the original source pointer
was reassigned to something else bad things could happen.

Closes:  https://sourceforge.net/p/flac/bugs/377/
2014-03-23 21:41:01 +11:00
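The ownership rule described in the commit above can be sketched as follows. This is not the code from metadata_object.c; entry_t and entry_set are hypothetical names, and the only point illustrated is that a zero length string is still copied when the copy parameter is set.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-prefixed comment entry. */
typedef struct {
    size_t length;
    unsigned char *entry;
} entry_t;

/* Store src in dst. Returns 1 on success, 0 on allocation failure.
 * With copy != 0 the entry always owns its own buffer, even for length 0. */
static int entry_set(entry_t *dst, unsigned char *src, size_t length, int copy)
{
    unsigned char *owned = src;
    if (copy) {
        /* Allocate even when length == 0 so dst never aliases caller memory. */
        owned = malloc(length + 1);
        if (owned == NULL)
            return 0;
        if (length > 0)
            memcpy(owned, src, length);
        owned[length] = '\0';
    }
    free(dst->entry); /* free(NULL) is a no-op */
    dst->entry = owned;
    dst->length = length;
    return 1;
}
```

With this shape, the caller can reuse or reassign its source pointer after the call without affecting the stored entry.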
Erik de Castro Lopo
697dbdee8f Revert "Attempt to fix differences between x86 FPU and SSE calculations."
This reverts commit 70b078cfd5.

The code in the patch we're reverting probably only works for one
compiler and could easily stop working with the next release of
that compiler.
2014-03-23 19:58:44 +11:00
Erik de Castro Lopo
70b078cfd5 Attempt to fix differences between x86 FPU and SSE calculations.
The x86 FPU holds intermediate results in larger registers than what
the SSE unit uses, resulting in slightly different encodings of audio
data. Attempt to fix this by modifying libFLAC/lpc.c to store calculation
results in a FLAC__real before adding it to a sum.

At the moment this works, but I could easily imagine a new version of
the compiler optimising this store to the FLAC__real away leaving us
in the same situation we have now.

Patch-from: Oliver Stöneberg on sourceforge.net
Closes: https://sourceforge.net/p/flac/bugs/409/
2014-03-21 19:26:08 +11:00
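A rough sketch of the idea in the commit above: round each product to the narrower type before it is accumulated. The names real_t and dot are hypothetical, real_t is assumed to be float, and the volatile qualifier is a swapped-in technique to keep the compiler from eliding the store; the commit message does not say the patch used it.

```c
/* Stands in for FLAC__real; assumed here to be float. */
typedef float real_t;

/* Dot product that rounds every product to real_t before summing. */
static real_t dot(const real_t *a, const real_t *b, int n)
{
    real_t sum = 0.0f;
    int i;
    for (i = 0; i < n; i++) {
        /* volatile forces a real store, so the product is rounded to
         * real_t instead of staying in a wider FPU register. */
        volatile real_t product = a[i] * b[i];
        sum += product;
    }
    return sum;
}
```

The running sum can still be held at higher precision between iterations, so this only narrows one of the places where x87 and SSE results may diverge.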
Erik de Castro Lopo
d7e6d91fba Fix build issue on OSX with GCC 4.2/Xcode.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-21 17:45:21 +11:00
Erik de Castro Lopo
99d5154f43 libFLAC/cpu.c : Detect SSE correctly on Windows when compiling with MinGW.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:40:56 +11:00
Erik de Castro Lopo
47bd9964fa stream_encoder/decoder : Comment fixes.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Erik de Castro Lopo
f931d13411 libFLAC/format.c : Remove MSVC6 specific hack.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Erik de Castro Lopo
bdcbce4bfa lpc_asm.nasm : Whitespace.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Erik de Castro Lopo
d36ef6298b Whitespace.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-14 15:33:11 +11:00
Fengwei Yin
7cbecbae9f Using uintptr_t to simplify pointer handling a little bit
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
2014-03-14 15:32:50 +11:00
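As an illustration of how uintptr_t can simplify pointer arithmetic, here is a minimal sketch that rounds a pointer up to an alignment boundary. The commit message does not say which code was changed, so align_up is a hypothetical helper rather than the function from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Round p up to the next multiple of alignment.
 * alignment must be a power of two. */
static void *align_up(void *p, size_t alignment)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;
    return (void *)((addr + mask) & ~mask);
}
```

Because uintptr_t is wide enough to hold a pointer on both 32-bit and 64-bit targets, no per-platform integer type selection is needed.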
Erik de Castro Lopo
15e28a4b94 src/libFLAC/ : CPU feature detection improvements.
CPU detection used to depend on ASM code. Now CPU features are
also detected when only FLAC__HAS_X86INTRIN is defined.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-03-02 21:36:45 +11:00
Erik de Castro Lopo
ace63cc828 stream_encoder.c : ifdef cleanup.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-25 18:38:20 +11:00
Erik de Castro Lopo
b334fb2a5c Fix typos in comments.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-24 21:47:20 +11:00
Erik de Castro Lopo
cf0e42ae6e Don't use intrinsics when they are slower.
More thorough en-/decoding tests show that sometimes the functions
that use intrinsics are slower (or not really faster) than old
plain C functions.

After this patch the encoder doesn't use these new functions
when their usefulness is questionable.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-24 21:46:05 +11:00
Erik de Castro Lopo
71c9555366 bitmath.h : Fixes for FLAC__bitmath_ilog2_wide().
Existing version had a number of problems:
1) it didn't compile with MSVS
2) it returned correct results only when compiled with GNUC
3) it mentioned LGPL which isn't good for a BSD-licensed library

LGPL -> BSD issue documented here:
http://lists.xiph.org/pipermail/flac-dev/2013-September/004356.html

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-02 10:42:20 +11:00
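For reference, a compiler-independent integer log2 for 64-bit values can be sketched with plain shifts. This is not the code from bitmath.h; ilog2_wide is a hypothetical name and the loop is deliberately the simplest correct form rather than the fastest.

```c
#include <stdint.h>

/* floor(log2(v)) for v > 0. Uses only shifts, so the result does not
 * depend on compiler builtins. Returns 0 for v == 0 as well. */
static unsigned ilog2_wide(uint64_t v)
{
    unsigned l = 0;
    while (v >>= 1)
        l++;
    return l;
}
```

A production version would typically dispatch to a compiler builtin where one is available and keep a loop like this as the fallback.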
Erik de Castro Lopo
26b9546149 Add sse2 intrinsics code for lpc_restore_signal_...()
The new functions are analogous to FLAC__lpc_restore_signal_asm_ia32_mmx.
FLAC uses them for x86-64 arch and also for ia32 if NASM is not available.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-02 08:55:56 +11:00
Erik de Castro Lopo
d163ef4567 libFLAC/stream_encoder.c : Fall back to intrinsics if NASM is not available.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-02-01 20:34:55 +11:00
Erik de Castro Lopo
59cfca0030 stream_encoder : Remove un-needed conversion from __m128i to FLAC__uint64.
Encoding speed slightly increased (1...2% for FLAC -8).

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-31 20:54:59 +11:00
Erik de Castro Lopo
4618512de2 Add a fast shift for int64 values.
This patch changes the code from:
	(FLAC__int32)(xmm.m128i_i64[0] >> lp_quantization)
into:
	_mm_cvtsi128_si32(_mm_srli_epi64(xmm, lp_quantization));

Encoding of 24-bit .wav files with 32-bit FLAC became noticeably faster.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-31 20:36:23 +11:00
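The replacement in the commit above swaps an arithmetic shift for _mm_srli_epi64, which is a logical shift, and then keeps only the low 32 bits. The portable sketch below, with hypothetical function names, checks why that is safe: for shift counts from 0 to 32, the low 32 bits of both shifts are the same.

```c
#include <stdint.h>

/* Low 32 bits after a logical (zero-filling) 64-bit right shift. */
static uint32_t low32_logical(int64_t x, int n)
{
    return (uint32_t)((uint64_t)x >> n);
}

/* Low 32 bits after a signed right shift. Shifting a negative value is
 * implementation-defined in C; mainstream compilers shift arithmetically. */
static uint32_t low32_arithmetic(int64_t x, int n)
{
    return (uint32_t)(x >> n);
}

/* Returns 1 if both shifts agree for every count in 0..32. */
static int shifts_agree(int64_t x)
{
    int n;
    for (n = 0; n <= 32; n++) {
        if (low32_logical(x, n) != low32_arithmetic(x, n))
            return 0;
    }
    return 1;
}
```

The two shifts differ only in the bits filled in from the top, and those never reach the low 32 bits unless the count exceeds 32.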
Erik de Castro Lopo
a03999f570 lpc_intrin_sse2.c : Add RESIDUAL16_RESULT macro.
RESIDUAL16_RESULT is analogous to the existing RESIDUAL_RESULT macro
and simplifies the code a little.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 22:17:08 +11:00
Erik de Castro Lopo
1d920993f1 Remove redundant inline macro def.
The inline macro already exists in include/share/compat.h.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 21:57:21 +11:00
Erik de Castro Lopo
57297eea26 Add __INTEL_COMPILER to _MSC_VER #ifdefs.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 21:53:41 +11:00
Erik de Castro Lopo
d40e986a1e Add FLAC__SSE_SUPPORTED and FLAC__SSE2_SUPPORTED flags.
* Allow compiling using GCC w/o SSE support.
* Allow SSE4.1 intrinsic functions to be enabled.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-30 21:49:55 +11:00
Erik de Castro Lopo
c2747bec1c lpc_asm.nasm : More 'mov cl' -> 'mov ecx' fixes.
According to Agner Fog in optimizing_assembly.pdf:

  "... write to a partial register may result in false dependencies
   between instructions, so it is better to avoid it."

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-18 07:55:19 +11:00
Erik de Castro Lopo
7e9278934e libFLAC : Add asm versions for two _wide() functions.
GCC generates slow ia32 code for FLAC__lpc_restore_signal_wide() and
FLAC__lpc_compute_residual_from_qlp_coefficients_wide() so 24-bit
encoding/decoding is slower for GCC compile than for MSVS or ICC
compile. This patch adds ia32 asm versions of these functions.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-07 21:35:08 +11:00
Erik de Castro Lopo
8e4a45ac86 libFLAC/ia32/lpc_asm.nasm : Match calls and returns.
According to Agner Fog, "...you must make sure that all calls
are matched with returns. Never jump out of a subroutine without
a return and never use a return as an indirect jump."

(see paragraph 3.15 in microarchitecture.pdf and
examples 3.5a and 3.5b in optimizing_assembly.pdf)

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-07 21:27:09 +11:00
Erik de Castro Lopo
6cd8b42438 Add FLAC__ prefix to precompute_partition_info_sums....
Most non-static functions have FLAC__ prefix, but they were missing
from the precompute_partition_info_sums_* functions.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2014-01-07 21:27:00 +11:00
Gustavo Zacarias
d65ede3e87 Fix Makefile.am altivec logic
Besides SPE (FSL e500v? cores) there are other powerpc processors
that don't support altivec instructions so only enable them when it's
100% sure that the target has it.

Signed-off-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
2013-12-20 05:57:33 +11:00
Erik de Castro Lopo
64f34e6e99 libFLAC/stream_encoder.c : Fix MSVS profiler hot spot.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-10-10 21:32:07 +11:00
Erik de Castro Lopo
cf28c0144b Adds use of restrict keyword to improve encoding speed.
Restrict works very poorly in Visual Studio (much slower than without)
so defined flac_restrict in share/compat.h and use that in:

    lpc_compute_residual...()
    lpc_restore_signal...()

As a result, FLAC__lpc_compute_residual_from_qlp_coefficients_wide_intrin_sse41()
offers no advantage for 64-bit compiles and was removed from x86-64 part
of stream_encoder.c

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-10-10 18:24:19 +11:00
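A minimal sketch of a compiler-dependent restrict macro of the kind the commit above describes. The real flac_restrict definition in share/compat.h is not shown in the commit message, so the macro name demo_restrict, its mapping, and the function diff are assumptions for illustration only.

```c
#include <stddef.h>

/* Assumed mapping: use the GNU spelling where available and expand to
 * nothing elsewhere (for example on compilers where restrict is slower). */
#if defined(__GNUC__) && !defined(_MSC_VER)
#define demo_restrict __restrict__
#else
#define demo_restrict
#endif

/* First-order difference: out[i] = in[i + 1] - in[i] for i in [0, n).
 * in must hold n + 1 values; in and out must not overlap. */
static void diff(const int *demo_restrict in, int *demo_restrict out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = in[i + 1] - in[i];
}
```

Promising the compiler that the input and output buffers do not alias lets it keep values in registers and vectorise the loop more freely.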
Erik de Castro Lopo
a1abfa3df2 Vcproj file updates.
Replaces
     OutputDirectory="..\..\..\..\objs\debug\bin"
with
     OutputDirectory="$(SolutionDir)objs\$(ConfigurationName)\bin"
and so on.

Removes
     OutputFile="..\..\objs\debug\lib\$(ProjectName).lib"
when possible.

Also, in the current version "Whole program optimization" compiler option
is set, but the corresponding linker option isn't. From MSDN:
   "If you do not explicitly specify /LTCG when you pass /GL or MSIL modules
   to the linker, the linker eventually detects this and restarts the link
   by using /LTCG. Explicitly specify /LTCG when you pass /GL and MSIL modules
   to the linker for the fastest possible build performance."
So /LTCG option was added too.

Debug build now uses libogg_static.lib from .\objs\debug\lib folder.
(the dependency for both release and debug is
    objs\$(ConfigurationName)\lib\libogg_static.lib)

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-10-04 13:50:01 +10:00
Erik de Castro Lopo
ecd0acba75 Improve x86 intrinsic implementation.
* Splits lpc_x86intrin.c to lpc_intrin_sse.c and lpc_intrin_sse2.c
* Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
  function to lpc_intrin_sse2.c
* Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
  (useful for 24-bit en-/decoding)
* Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
  disables precompute_partition_info_sums_32bit_asm_ia32_().
  SSE2 version uses 4 SSE2 instructions instead of 1 SSSE3 instruction
  PABSD so it is slightly slower.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-10-04 01:41:48 +10:00
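The commit above notes that SSE2 has no single instruction equivalent to PABSD. It does not list the instructions used, so the following scalar sketch shows one common idiom (sign mask, xor, subtract) as an assumption about the general approach, with abs32 as a hypothetical name.

```c
#include <stdint.h>

/* Absolute value without a dedicated abs instruction.
 * mask is all ones for negative x and zero otherwise, which is what an
 * arithmetic right shift by 31 produces in each 32-bit lane. */
static uint32_t abs32(int32_t x)
{
    uint32_t mask = (x < 0) ? 0xFFFFFFFFu : 0u;
    /* For negative x this computes (~x) + 1, i.e. two's complement negation. */
    return ((uint32_t)x ^ mask) - mask;
}
```

In a vector version each of these steps maps to one SSE2 instruction per register of four lanes, which is why it costs a few instructions where SSSE3 needs one.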
Erik de Castro Lopo
bd6a920e40 Add FLAC__HAS_X86INTRIN to vcproj files.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-09-27 03:10:37 +10:00
Erik de Castro Lopo
31a79d7e9a Move M_PI definition to include/share/compat.h.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-09-27 03:05:06 +10:00
Erik de Castro Lopo
4a78cd4e4c Remove union data from FLAC__CPUInfo.
Before this patch it was possible to set or get data.ia32.sse3 value
from x86-64 code, etc which is a potential source of errors.

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-09-25 23:07:46 +10:00
Erik de Castro Lopo
8fe2c23e31 Add SSE4.1/SSE4.2 detection.
Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-09-25 23:05:17 +10:00
Erik de Castro Lopo
ae4d720417 Fix/re-enable SSE/SSE2 lpc optimisations. 2013-09-17 06:14:50 +10:00
Erik de Castro Lopo
bd9770ffd1 Only allow SSE2 intrinsics for x86_64. 2013-09-15 19:37:53 +10:00
Erik de Castro Lopo
0752740d8d src/libFLAC/lpc.c : Fix compiler warning. 2013-09-15 10:29:19 +10:00
Erik de Castro Lopo
e07bd181b1 lpc_x86intrin.c : Tweaks.
Include <config.h> before trying to use values defined in that file.
Fix compiler warnings.
2013-09-15 10:29:19 +10:00
Erik de Castro Lopo
5e5ee2720c Adds SSE-accelerated lpc functions.
New functions are:
    FLAC__lpc_compute_autocorrelation_intrin_sse_lag_4()
    FLAC__lpc_compute_autocorrelation_intrin_sse_lag_8()
    FLAC__lpc_compute_autocorrelation_intrin_sse_lag_12()
    FLAC__lpc_compute_autocorrelation_intrin_sse_lag_16()
    FLAC__lpc_compute_residual_from_qlp_coefficients_16_intrin_sse2()

Patch-from: lvqcl <lvqcl.mail@gmail.com>
2013-09-15 10:29:19 +10:00