Improve x86 intrinsic implementation.

* Split lpc_x86intrin.c into lpc_intrin_sse.c and lpc_intrin_sse2.c
* Add FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
  function to lpc_intrin_sse2.c
* Add lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
  (useful for 24-bit en-/decoding)
* Add precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
  disable precompute_partition_info_sums_32bit_asm_ia32_().
  The SSE2 version uses 4 SSE2 instructions in place of the single
  SSSE3 instruction PABSD, so it is slightly slower.
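
The PABSD remark above can be illustrated with a minimal sketch. It assumes the usual shift/xor/subtract idiom for emulating a packed 32-bit absolute value with SSE2 only; whether the patch uses exactly this sequence is an assumption, and the helper names abs_epi32_sse2() and abs_sum_sse2() are hypothetical, not taken from the patch. With SSSE3, the body of abs_epi32_sse2() would collapse to a single _mm_abs_epi32() call (PABSD).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: per-lane |x| of four int32 values, SSE2 only. */
static __m128i abs_epi32_sse2(__m128i v)
{
    __m128i sign = _mm_srai_epi32(v, 31);    /* PSRAD: 0 if x >= 0, -1 if x < 0 */
    __m128i flipped = _mm_xor_si128(v, sign); /* PXOR: inverts bits of negative lanes */
    return _mm_sub_epi32(flipped, sign);      /* PSUBD: adds 1 to negative lanes */
}

/* Hypothetical helper: sum of absolute values of n residuals. */
static uint32_t abs_sum_sse2(const int32_t *res, unsigned n)
{
    __m128i acc = _mm_setzero_si128();
    uint32_t sum;
    unsigned i;

    /* Process four residuals per iteration. */
    for (i = 0; i + 4 <= n; i += 4) {
        __m128i v = _mm_loadu_si128((const __m128i *)(res + i));
        acc = _mm_add_epi32(acc, abs_epi32_sse2(v));
    }

    /* Horizontal add of the four 32-bit lanes. */
    acc = _mm_add_epi32(acc, _mm_srli_si128(acc, 8));
    acc = _mm_add_epi32(acc, _mm_srli_si128(acc, 4));
    sum = (uint32_t)_mm_cvtsi128_si32(acc);

    /* Scalar tail for the remaining 0..3 values. */
    for (; i < n; i++)
        sum += res[i] < 0 ? 0u - (uint32_t)res[i] : (uint32_t)res[i];

    return sum;
}
```

The three arithmetic intrinsics map to PSRAD, PXOR and PSUBD; a compiler will typically also emit a register copy to preserve the input, which is one plausible way to arrive at the four instructions mentioned above.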

Patch-from: lvqcl <lvqcl.mail@gmail.com>
Erik de Castro Lopo
2013-10-04 01:38:00 +10:00
parent bd6a920e40
commit ecd0acba75
16 changed files with 3186 additions and 576 deletions

@@ -195,4 +195,12 @@ int flac_snprintf(char *str, size_t size, const char *fmt, ...);
};
#endif
/* SSSE3, SSE4 support: MSVS 2008, GCC 4.3 -- currently disabled, Intel Compiler 10.0 */
#if ( defined _MSC_VER && _MSC_VER >= 1500 ) \
|| ( 0 && defined __GNUC__ && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)) ) \
|| ( defined __INTEL_COMPILER && __INTEL_COMPILER >= 1000 )
#define FLAC__SSSE3_SUPPORTED 1
#define FLAC__SSE4_SUPPORTED 1
#endif
#endif /* FLAC__SHARE__COMPAT_H */