Improve x86 intrinsic implementation.

* Split lpc_x86intrin.c into lpc_intrin_sse.c and lpc_intrin_sse2.c
* Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
  function to lpc_intrin_sse2.c
* Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
  (useful for 24-bit en-/decoding)
* Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
  disable precompute_partition_info_sums_32bit_asm_ia32_().
  The SSE2 version uses 4 SSE2 instructions in place of the single
  SSSE3 instruction PABSD, so it is slightly slower.
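
The last point can be illustrated with a small sketch. PABSD (the `_mm_abs_epi32()` intrinsic from `<tmmintrin.h>`) takes the absolute value of four packed 32-bit integers in one instruction, but it requires SSSE3. On SSE2-only CPUs the same result has to be built from several instructions. The code below shows one common branch-free idiom; it is an assumption, not the exact sequence used in the patch, and the helper names `abs_epi32_sse2` and `abs4_i32` are hypothetical.

```c
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper, not taken from the patch: |x| for four packed
 * 32-bit lanes using only SSE2 instructions. */
static __m128i abs_epi32_sse2(__m128i x)
{
    /* Arithmetic shift: 0 for non-negative lanes, all ones for negative lanes. */
    __m128i sign = _mm_srai_epi32(x, 31);
    /* XOR with the mask flips every bit of the negative lanes (gives ~x). */
    x = _mm_xor_si128(x, sign);
    /* Subtracting the mask (-1) adds 1, so negative lanes become ~x + 1 == -x. */
    return _mm_sub_epi32(x, sign);
}

/* Hypothetical convenience wrapper working on plain arrays. */
static void abs4_i32(const int32_t in[4], int32_t out[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, abs_epi32_sse2(v));
}
```

Calling `abs4_i32()` on `{ 5, -7, 0, -123456 }` stores `{ 5, 7, 0, 123456 }`. As with PABSD, `INT32_MIN` is left unchanged because its negation does not fit in 32 bits.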

Patch-from: lvqcl <lvqcl.mail@gmail.com>
Erik de Castro Lopo
2013-10-04 01:38:00 +10:00
parent bd6a920e40
commit ecd0acba75
16 changed files with 3186 additions and 576 deletions

@@ -125,13 +125,17 @@ libFLAC_sources = \
 	float.c \
 	format.c \
 	lpc.c \
-	lpc_x86intrin.c \
+	lpc_intrin_sse.c \
+	lpc_intrin_sse2.c \
+	lpc_intrin_sse41.c \
 	md5.c \
 	memory.c \
 	metadata_iterators.c \
 	metadata_object.c \
 	stream_decoder.c \
 	stream_encoder.c \
+	stream_encoder_intrin_sse2.c \
+	stream_encoder_intrin_ssse3.c \
 	stream_encoder_framing.c \
 	window.c \
 	$(extra_ogg_sources)