Compare commits

...

783 Commits

Author SHA1 Message Date
Evgenii Kliuchnikov
8ca2312c61 fix release workflow
PiperOrigin-RevId: 822073417
2025-10-21 05:37:07 -07:00
Evgenii Kliuchnikov
ee771daf20 fix copy-paste in Java decoder
PiperOrigin-RevId: 822024938
2025-10-21 02:42:59 -07:00
Evgenii Kliuchnikov
42aee32891 try to fix release workflow
PiperOrigin-RevId: 822012047
2025-10-21 02:03:38 -07:00
Evgenii Kliuchnikov
392c06bac0 redesign release resource uploading
PiperOrigin-RevId: 821982935
2025-10-21 00:22:30 -07:00
Evgenii Kliuchnikov
1964cdb1b9 ramp up all GH actions plugins
PiperOrigin-RevId: 821598646
2025-10-20 05:07:13 -07:00
Evgenii Kliuchnikov
61605b1cb3 pick VCPKG patches
PiperOrigin-RevId: 821593009
2025-10-20 04:44:24 -07:00
Evgenii Kliuchnikov
4b0f27b6f9 pick changes from Alpine patch
PiperOrigin-RevId: 816164347
2025-10-07 05:39:06 -07:00
Evgenii Kliuchnikov
1e4425a372 pick changes from Debian patch
PiperOrigin-RevId: 816157554
2025-10-07 05:16:20 -07:00
Copybara-Service
f038020bd7 Merge pull request #1346 from google:dependabot/github_actions/softprops/action-gh-release-2.3.4
PiperOrigin-RevId: 816151932
2025-10-07 04:56:02 -07:00
Copybara-Service
4d5a32bf45 Merge pull request #1347 from google:dependabot/github_actions/ossf/scorecard-action-2.4.3
PiperOrigin-RevId: 816151799
2025-10-07 04:54:57 -07:00
Evgenii Kliuchnikov
34e43eb020 fix typos
PiperOrigin-RevId: 815676548
2025-10-06 05:16:07 -07:00
Evgenii Kliuchnikov
b3142143f6 add installation section to README
PiperOrigin-RevId: 815667870
2025-10-06 04:45:22 -07:00
Evgenii Kliuchnikov
0e7ea31e6b add alternative unaligned memory access for MIPS
PiperOrigin-RevId: 815662268
2025-10-06 04:26:03 -07:00
Evgenii Kliuchnikov
da2e091eb7 prepare for v1.2.0.rc1
PiperOrigin-RevId: 815650799
2025-10-06 03:48:49 -07:00
dependabot[bot]
30576423b8 Bump ossf/scorecard-action from 2.4.2 to 2.4.3
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.4.2 to 2.4.3.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](05b42c6244...4eaacf0543)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-version: 2.4.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-06 09:30:15 +00:00
dependabot[bot]
9cf25439ad Bump softprops/action-gh-release from 2.3.3 to 2.3.4
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 2.3.3 to 2.3.4.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](6cbd405e2c...62c96d0c4e)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-version: 2.3.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-10-06 09:29:59 +00:00
Copybara-Service
82d3c163cb Merge pull request #1328 from akazwz:go
PiperOrigin-RevId: 815626477
2025-10-06 02:29:02 -07:00
Copybara-Service
4876ada111 Merge pull request #1335 from google:dependabot/github_actions/actions/cache-4.3.0
PiperOrigin-RevId: 815624406
2025-10-06 02:21:54 -07:00
Eugene Kliuchnikov
f1c80224e8 Fix some typos / non-typos. (#1345) 2025-10-03 12:16:07 +02:00
Evgenii Kliuchnikov
ed93810e27 support multi-phase initialization
PiperOrigin-RevId: 814128632
2025-10-02 01:51:02 -07:00
Eugene Kliuchnikov
7cc02a1687 Merge branch 'master' into dependabot/github_actions/actions/cache-4.3.0 2025-10-01 21:06:49 +02:00
Evgenii Kliuchnikov
54481d4ebe use builtin bswap when available
PiperOrigin-RevId: 813849285
2025-10-01 11:56:09 -07:00
Evgenii Kliuchnikov
a896e79d4f Java: ramp-up artifact versions in pom files
PiperOrigin-RevId: 813673237
2025-10-01 03:26:40 -07:00
Evgenii Kliuchnikov
947f74e908 update links in readme
PiperOrigin-RevId: 813665159
2025-10-01 02:58:49 -07:00
Evgenii Kliuchnikov
916e4a46a8 update docs
PiperOrigin-RevId: 813658707
2025-10-01 02:37:06 -07:00
Eugene Kliuchnikov
e4e56a3203 Add missing newline 2025-09-29 16:32:28 +02:00
Eugene Kliuchnikov
2c5f2d1198 Merge branch 'master' into go 2025-09-29 15:39:10 +02:00
Eugene Kliuchnikov
f3b0ceed2d Merge branch 'master' into dependabot/github_actions/actions/cache-4.3.0 2025-09-29 15:37:19 +02:00
Evgenii Kliuchnikov
1f6ab76bff use module-bound exception
PiperOrigin-RevId: 812739918
2025-09-29 05:12:31 -07:00
dependabot[bot]
5c79b32b14 Bump actions/cache from 4.2.4 to 4.3.0
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.4 to 4.3.0.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](0400d5f644...0057852bfa)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 4.3.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-29 09:19:37 +00:00
Copybara-Service
d74b0a4a22 Merge pull request #1323 from google:dependabot/github_actions/actions/setup-python-6.0.0
PiperOrigin-RevId: 811710346
2025-09-26 01:25:56 -07:00
Copybara-Service
dbcb332b66 Merge pull request #1324 from google:dependabot/github_actions/softprops/action-gh-release-2.3.3
PiperOrigin-RevId: 811710240
2025-09-26 01:24:57 -07:00
dependabot[bot]
c0d785dfe2 Bump actions/setup-python from 5.6.0 to 6.0.0
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.6.0 to 6.0.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](a26af69be9...e797f83bcb)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-25 15:02:31 +00:00
dependabot[bot]
466613c266 Bump softprops/action-gh-release from 2.3.2 to 2.3.3
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 2.3.2 to 2.3.3.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](72f2c25fcb...6cbd405e2c)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-version: 2.3.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-09-25 15:01:52 +00:00
Evgenii Kliuchnikov
1406898440 Build and test with PY2.7
PiperOrigin-RevId: 811352084
2025-09-25 08:00:20 -07:00
Evgenii Kliuchnikov
0bef8a6936 clarify that prepared dictionaries are "lean"
PiperOrigin-RevId: 811236534
2025-09-25 01:27:28 -07:00
Evgenii Kliuchnikov
9686382ff3 PY: continue renovation of extension
Fixed unchecked malloc for "tail" input data.
Fixed inconsistencies in docstrings.

Rewritten "growable buffer" to C-code, so it could run without acquiring GIL.

Breaking changes:
 - native object allocation failures now handled at object creation time
 - some lower level exceptions (e.g. OOM) are not shadowed by brotli.error

PiperOrigin-RevId: 810813664
2025-09-24 03:52:44 -07:00
Evgenii Kliuchnikov
85d46ce6b5 Drop finalize()
Now it is solely embedders responisbility to close things that hold native resources. No more "safety net".

Consider "try-with-resources". For longer lasting items (e.g. native PreparedDictionary) use Cleaner as a last resort.

PiperOrigin-RevId: 807584792
2025-09-16 01:23:46 -07:00
akazwz
3d8eef20a6 Update Go modules to require Go 1.21 and replace ioutil with io package in reader.go 2025-09-12 23:52:50 +08:00
Evgenii Kliuchnikov
41a22f07f2 modernize PY3 class definition
PiperOrigin-RevId: 804460135
2025-09-08 09:15:53 -07:00
Evgenii Kliuchnikov
98a89b1563 temporary rollback
PiperOrigin-RevId: 803462595
2025-09-05 07:57:59 -07:00
Evgenii Kliuchnikov
35d4992ac8 PY: reformat _brotli.c
PiperOrigin-RevId: 803412055
2025-09-05 04:42:06 -07:00
Copybara-Service
3287b89bd4 Merge pull request #1257 from domysh:master
PiperOrigin-RevId: 803020625
2025-09-04 07:39:05 -07:00
Copybara-Service
4bf850a56a Merge pull request #1089 from SpaceIm:macos-relocatable
PiperOrigin-RevId: 803012803
2025-09-04 07:14:02 -07:00
Copybara-Service
0557c81d1e Merge pull request #1307 from google:dependabot/github_actions/actions/cache-4.2.4
PiperOrigin-RevId: 802995746
2025-09-04 06:14:21 -07:00
Eugene Kliuchnikov
f5e1f0641d Merge branch 'master' into macos-relocatable 2025-09-04 14:52:06 +02:00
Eugene Kliuchnikov
0903aad0d7 Update setup.py 2025-09-04 14:50:43 +02:00
Eugene Kliuchnikov
447385795f Update setup.py 2025-09-04 14:48:46 +02:00
Eugene Kliuchnikov
8fea9e31c8 Merge branch 'master' into master 2025-09-04 14:42:48 +02:00
Evgenii Kliuchnikov
20ed1374ff add decoder static init
PiperOrigin-RevId: 802957201
2025-09-04 03:58:47 -07:00
Evgenii Kliuchnikov
310f2119cf pull common static init header
PiperOrigin-RevId: 801763621
2025-09-01 04:46:12 -07:00
Evgenii Kliuchnikov
30c7d2f97d split prefix.h to .h/.cc/_inc.h
PiperOrigin-RevId: 801742168
2025-09-01 03:24:08 -07:00
Eugene Kliuchnikov
79ea7296d4 Merge branch 'master' into master 2025-08-29 15:06:37 +02:00
Eugene Kliuchnikov
7880aa1307 Merge branch 'master' into dependabot/github_actions/actions/cache-4.2.4 2025-08-29 14:48:15 +02:00
Evgenii Kliuchnikov
25190700e2 use BROTLI_COLD in enc
PiperOrigin-RevId: 800516878
2025-08-28 10:17:30 -07:00
Evgenii Kliuchnikov
cb29dec4ed Introduce BROTLI_COLD
PiperOrigin-RevId: 799674933
2025-08-26 12:48:16 -07:00
Evgenii Kliuchnikov
643b22949d AI ate my code
PiperOrigin-RevId: 799658697
2025-08-26 12:03:48 -07:00
Evgenii Kliuchnikov
e7b0c08b51 move bulky generated constants out of main code
PiperOrigin-RevId: 799083333
2025-08-25 05:36:04 -07:00
Evgenii Kliuchnikov
9a4ba5932b internal change
PiperOrigin-RevId: 795452145
2025-08-15 06:18:41 -07:00
Evgenii Kliuchnikov
3cc6172f58 uninline ShannonEntropy/BitsEntropy
PiperOrigin-RevId: 794966726
2025-08-14 03:50:40 -07:00
Evgenii Kliuchnikov
103b25fb1e explicitly specify model for relocatable variables
PiperOrigin-RevId: 794473371
2025-08-13 02:11:12 -07:00
dependabot[bot]
ea7d6c3737 Bump actions/cache from 4.2.3 to 4.2.4
Bumps [actions/cache](https://github.com/actions/cache) from 4.2.3 to 4.2.4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](5a3ec84eff...0400d5f644)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: 4.2.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-11 13:48:51 +00:00
Evgenii Kliuchnikov
7b345944d6 adjust BROTLI_TEST effects
PiperOrigin-RevId: 793535397
2025-08-11 01:28:09 -07:00
Evgenii Kliuchnikov
6a4c96b110 more portable emergency exit
PiperOrigin-RevId: 792046166
2025-08-07 01:31:31 -07:00
Evgenii Kliuchnikov
29e040b8cb use static init to reduce encoder library size
PiperOrigin-RevId: 791661871
2025-08-06 06:18:01 -07:00
Evgenii Kliuchnikov
bf6231d685 Introduce static init
PiperOrigin-RevId: 791228365
2025-08-06 06:17:48 -07:00
Copybara-Service
377afe9e4f Merge pull request #1298 from Cycloctane:master
PiperOrigin-RevId: 791171172
2025-08-05 05:34:44 -07:00
Eugene Kliuchnikov
bf2e456437 Merge branch 'master' into master 2025-08-05 14:03:21 +02:00
Evgenii Kliuchnikov
8a7201c602 fix some includes
PiperOrigin-RevId: 791124445
2025-08-05 02:49:40 -07:00
Evgenii Kliuchnikov
12203bb586 Extract Hash14/15 to hash_base
PiperOrigin-RevId: 791061237
2025-08-04 23:30:28 -07:00
Evgenii Kliuchnikov
172fe58fab Forward imports in types.h
PiperOrigin-RevId: 790683957
2025-08-04 04:03:20 -07:00
Cycloctane
bf88e69b29 Update CHANGELOG.md 2025-07-29 22:05:14 +08:00
Evgenii Kliuchnikov
a47d747506 Roll back:
Enable shared_dictionary for quality 3 and 4.

PiperOrigin-RevId: 781976993
2025-07-11 07:39:05 -07:00
Copybara-Service
564ddad5ac Merge pull request #1270 from google:dependabot/github_actions/ossf/scorecard-action-2.4.2
PiperOrigin-RevId: 781920549
2025-07-11 04:00:51 -07:00
Copybara-Service
51a9318883 Merge pull request #1281 from google:dependabot/github_actions/softprops/action-gh-release-2.3.2
PiperOrigin-RevId: 781919147
2025-07-11 03:54:31 -07:00
Copybara-Service
400b1e1e6c Merge pull request #1294 from lowkeyrossi:support_arm64
PiperOrigin-RevId: 781918885
2025-07-11 03:53:27 -07:00
Eugene Kliuchnikov
5f84f3b412 Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.4.2 2025-07-11 12:49:30 +02:00
Eugene Kliuchnikov
39d91926a1 Merge branch 'master' into dependabot/github_actions/softprops/action-gh-release-2.3.2 2025-07-11 12:48:51 +02:00
newyork_loki
15fb8bfedc Merge branch 'master' into support_arm64 2025-07-11 07:55:55 +05:30
Copybara-Service
b80c3ad14f Merge pull request #1292 from radarhere:patch-1
PiperOrigin-RevId: 781416466
2025-07-10 01:16:46 -07:00
Eugene Kliuchnikov
8ab7027a3a Merge branch 'master' into patch-1 2025-07-09 22:15:39 +02:00
newyork_loki
61ebc6cbad Add support for release builds for Windows ARM64 2025-07-09 18:02:34 +05:30
newyork_loki
ab9b7b07c7 Add support for release builds for Windows ARM64 2025-07-09 18:00:11 +05:30
Brotli
42c5139b80 Enable shared_dictionary for quality 3 and 4.
PiperOrigin-RevId: 780946946
2025-07-09 02:34:01 -07:00
Andrew Murray
81fb9d557c Fixed typo 2025-07-08 22:47:36 +10:00
Brotli
434b582d49 Fix compilation errors when BROTLI_DEBUG is defined.
PiperOrigin-RevId: 777494216
2025-06-30 03:32:03 -07:00
Eugene Kliuchnikov
5a7338e952 Merge branch 'master' into dependabot/github_actions/softprops/action-gh-release-2.3.2 2025-06-26 14:40:01 +02:00
Copybara-Service
f2022b9ea3 Merge pull request #1288 from radarhere:patch-1
PiperOrigin-RevId: 776058396
2025-06-26 03:05:10 -07:00
Eugene Kliuchnikov
fe2d7039aa Merge branch 'master' into patch-1 2025-06-26 11:41:31 +02:00
Evgenii Kliuchnikov
2903f45e7d ignore slices that cross sample boundary
PiperOrigin-RevId: 775230773
2025-06-24 07:48:39 -07:00
Andrew Murray
63270dbfc0 Fixed typo 2025-06-24 18:30:02 +10:00
Brotli
4cb7828376 [brotli/go] Add missing importpath BUILD.bazel
Otherwise the package cannot be imported correctly in open source. This CL allows the brotli and cbrotli tests to pass correctly with bazel (open-source version of blaze).

PiperOrigin-RevId: 772917739
2025-06-18 07:46:47 -07:00
Brotli
418237cff4 No public description
PiperOrigin-RevId: 772905412
2025-06-18 07:07:05 -07:00
Evgenii Kliuchnikov
c02288edf3 fix test package
PiperOrigin-RevId: 772895417
2025-06-18 06:32:30 -07:00
Evgenii Kliuchnikov
b15ca9af07 pure golang decoder
PiperOrigin-RevId: 772827931
2025-06-18 02:19:02 -07:00
Evgenii Kliuchnikov
7bd1bd4463 add synth_test for cbrotli
PiperOrigin-RevId: 772439037
2025-06-17 06:02:25 -07:00
dependabot[bot]
7a08d69430 Bump softprops/action-gh-release from 2.2.2 to 2.3.2
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 2.2.2 to 2.3.2.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](da05d55257...72f2c25fcb)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-version: 2.3.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-16 08:22:07 +00:00
Domingo Dirutigliano
964ca075d1 fix module import in set exception + setup.py compatible with python2 2025-06-13 14:58:13 +02:00
Domingo Dirutigliano
ddb803acc3 using static function 2025-06-13 14:13:12 +02:00
Domingo Dirutigliano
998a28bde4 using module state for error handling 2025-06-13 14:09:51 +02:00
Evgenii Kliuchnikov
3efb30f96b Comb UTF-8 literal processing
PiperOrigin-RevId: 770672448
2025-06-12 08:48:11 -07:00
Evgenii Kliuchnikov
405b434d85 make cbrotli_test extenal to cbrotli
PiperOrigin-RevId: 770669431
2025-06-12 08:39:36 -07:00
Eugene Kliuchnikov
3d469dc62f Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.4.2 2025-06-11 13:07:19 +02:00
Evgenii Kliuchnikov
00ad1d4a3f CI: don't use deprecated Windows-2019 image
PiperOrigin-RevId: 770061926
2025-06-11 03:18:14 -07:00
Evgenii Kliuchnikov
ee5f3bb959 Refresh JS/TS/KT
PiperOrigin-RevId: 769617330
2025-06-10 07:08:25 -07:00
Evgenii Kliuchnikov
08bdaebbf8 modify Java decoder in a way it could be transpiled to exception unfriendly languages
PiperOrigin-RevId: 769488037
2025-06-10 00:28:15 -07:00
Eugene Kliuchnikov
bddbb9660c Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.4.2 2025-06-10 09:22:47 +02:00
Evgenii Kliuchnikov
271be11429 Prepare for transpilation to golang
PiperOrigin-RevId: 767024321
2025-06-04 01:05:06 -07:00
Evgenii Kliuchnikov
9c91b6a295 Restore bazel-win:go pipeline
PiperOrigin-RevId: 766158407
2025-06-02 05:57:44 -07:00
Eugene Kliuchnikov
ff59b386c6 Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.4.2 2025-06-02 12:54:29 +02:00
Brotli
00c6df4d2f Include go bindings in archive export
PiperOrigin-RevId: 766107843
2025-06-02 02:56:27 -07:00
dependabot[bot]
781b865c92 Bump ossf/scorecard-action from 2.4.1 to 2.4.2
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.4.1 to 2.4.2.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](f49aabe0b5...05b42c6244)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-version: 2.4.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-02 08:42:09 +00:00
Copybara-Service
e095150b91 Merge pull request #1182 from eugo-inc:master
PiperOrigin-RevId: 765091808
2025-05-30 02:00:59 -07:00
Benjamin Leff
dcc24d051d chore: fixing trailing whitespace in python/README.md 2025-05-28 14:13:46 -05:00
Benjamin Leff
5189ec1f56 Merge branch 'google:master' into master 2025-05-28 14:11:45 -05:00
Evgenii Kliuchnikov
6270bb54fa update CI pipelines
PiperOrigin-RevId: 764197690
2025-05-28 04:18:12 -07:00
Benjamin Leff
0e0ed5f0cc perf: conditionally install pkgconfig if using system brotli 2025-05-27 08:53:51 -05:00
Benjamin Leff
9b7beb425d Merge branch 'google:master' into master 2025-05-27 08:06:30 -05:00
Robin Watts
cecc0acc0b Fix ISO C build breakage. (#1255)
ISO C prohibits inline declarations of variables. Move the
declaration to the start of the block.

Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:59:17 +02:00
Richard Hughes
e9668c8cb0 Add a SBOM template in CycloneDX format (#1224)
Improve supply chain security by including a SBOM file with substituted values.

This will be used to construct a composite platform SBOM.

Signed-off-by: Richard Hughes <rhughes@redhat.com>
Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:50:34 +02:00
Andreas Deininger
93d0ac53aa Fix typos (#1242)
Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:47:01 +02:00
Christian Clauss
3487b6917d Update bro.py: Fix undefined name in Python code (#1240)
`outfile` is used before it is defined on the following line so use `options.outfile` instead.

% `ruff check`
```
Error: python/bro.py:168:64: F821 Undefined name `outfile`
```

Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:37:03 +02:00
dependabot[bot]
adc546b73a Bump ossf/scorecard-action from 2.4.0 to 2.4.1 (#1252)
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.4.0 to 2.4.1.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](62b2cac7ed...f49aabe0b5)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:31:34 +02:00
dependabot[bot]
b3d7283f96 Bump actions/cache from 4.1.2 to 4.2.3 (#1258)
Bumps [actions/cache](https://github.com/actions/cache) from 4.1.2 to 4.2.3.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](6849a64899...5a3ec84eff)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:21:49 +02:00
dependabot[bot]
fa308ddae5 Bump actions/upload-artifact from 4.4.3 to 4.6.2 (#1259)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.4.3 to 4.6.2.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](b4b15b8c7c...ea165f8d65)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:19:06 +02:00
dependabot[bot]
98cde63332 Bump softprops/action-gh-release from 2.1.0 to 2.2.2 (#1263)
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 2.1.0 to 2.2.2.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](01570a1f39...da05d55257)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-version: 2.2.2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eugene Kliuchnikov <eustas.ru@gmail.com>
2025-05-27 09:14:31 +02:00
dependabot[bot]
9c55070564 Bump actions/setup-python from 5.3.0 to 5.6.0 (#1265)
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.3.0 to 5.6.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](0b93645e9f...a26af69be9)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: 5.6.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-27 08:53:21 +02:00
Domingo Dirutigliano
e45f18ccca DECREF on exception raise 2025-03-15 11:23:32 +01:00
Domingo Dirutigliano
c710db0510 enabled support to 3.12 per GIL interpeter using multistage module inizialization 2025-03-15 09:13:46 +01:00
Evgenii Kliuchnikov
440e03642b Another nullptr-arithmetics clamer
PiperOrigin-RevId: 721741817
2025-01-31 05:43:09 -08:00
Evgenii Kliuchnikov
a1e3ab25ce Fix (speculative) nullptr arithmetic
PiperOrigin-RevId: 721739274
2025-01-31 05:34:24 -08:00
Copybara-Service
28b1183715 Merge pull request #1244 from eustas:reno
PiperOrigin-RevId: 716603037
2025-01-17 03:39:55 -08:00
Evgenii Kliuchnikov
ad3c0d5f47 Add release notes for recent python API update 2025-01-17 09:07:22 +00:00
Copybara-Service
67d78bc41d Merge pull request #1234 from robryk:sizelimit
PiperOrigin-RevId: 713282049
2025-01-08 07:18:31 -08:00
Eugene Kliuchnikov
d144c5833a Merge branch 'master' into sizelimit 2025-01-08 12:52:55 +01:00
Copybara-Service
b01b63a467 Merge pull request #1232 from eustas:fixp1
PiperOrigin-RevId: 713231604
2025-01-08 03:41:01 -08:00
Eugene Kliuchnikov
4034fe3654 Merge branch 'master' into sizelimit 2025-01-08 11:50:52 +01:00
Evgenii Kliuchnikov
281b0aa562 Fix most of build_test pipeline 2025-01-08 10:12:41 +00:00
Evgenii Kliuchnikov
57610b7174 translate includes in brotli/research
PiperOrigin-RevId: 713033033
2025-01-07 13:59:40 -08:00
Robert Obryk
eb3a31e2d3 add max_length to Python streaming decompression 2025-01-07 16:28:30 +01:00
Robert Obryk
28ce91caf6 add size limit to buffer 2025-01-07 16:28:30 +01:00
Evgenii Kliuchnikov
7ef8b920dc copybara config update to export /MODULE.bazel
PiperOrigin-RevId: 712795247
2025-01-07 00:07:39 -08:00
Evgenii Kliuchnikov
95b81fcc29 Partially pick https://github.com/google/brotli/pull/1232
PiperOrigin-RevId: 712791222
2025-01-06 23:52:23 -08:00
Brotli
d019271c8b Copybara import of the project:
--
f1bdfaa803 by Robert Obryk <robryk@google.com>:

add size limit to buffer

--
ef8922cee7 by Robert Obryk <robryk@google.com>:

add max_length to Python streaming decompression

PiperOrigin-RevId: 712463460
2025-01-06 03:05:56 -08:00
Copybara-Service
ef9e12f004 Merge pull request #1201 from robryk:sizelimit
PiperOrigin-RevId: 712456783
2025-01-06 02:37:01 -08:00
Robert Obryk
ef8922cee7 add max_length to Python streaming decompression 2025-01-06 11:26:42 +01:00
Copybara-Service
91d96d3d93 Merge pull request #1226 from google:dependabot/github_actions/softprops/action-gh-release-2.1.0
PiperOrigin-RevId: 699885941
2024-11-25 01:12:58 -08:00
dependabot[bot]
cf25e52f9b Bump softprops/action-gh-release from 2.0.2 to 2.1.0
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 2.0.2 to 2.1.0.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](d99959edae...01570a1f39)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-25 09:03:34 +00:00
Copybara-Service
841f672884 Merge pull request #1137 from google:dependabot/github_actions/softprops/action-gh-release-2
PiperOrigin-RevId: 698755745
2024-11-21 06:22:06 -08:00
Evgenii Kliuchnikov
5c66724029 Add dictionary API to cgo wrapper
PiperOrigin-RevId: 698745795
2024-11-21 05:39:04 -08:00
gorloffslava
0ea9927a2b + Improved setup.py formatting
+ Updated dependency resolution for edge cases in setup.py
2024-11-21 16:11:26 +05:00
Slava Gorlov
3a444a6f2d Merge branch 'google:master' into master 2024-11-21 12:26:45 +05:00
Copybara-Service
39904bdfe8 Merge pull request #1185 from dloebl:cbrotli-add-pkg-config-directive
PiperOrigin-RevId: 698329603
2024-11-20 03:26:24 -08:00
Evgenii Kliuchnikov
2dfaadcef3 (PY) clarify compressor mode parameter values
PiperOrigin-RevId: 698023020
2024-11-19 08:15:20 -08:00
dependabot[bot]
b08dc48782 Bump softprops/action-gh-release from 1 to 2
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 1 to 2.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](de2c0eb89a...d99959edae)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-19 14:57:49 +01:00
Daniel Lobl
90833a88cd cbrotli: add pkg-config directive 2024-11-19 13:11:56 +01:00
Copybara-Service
2b6efcbdcc Merge pull request #1204 from heshpdx:master
PiperOrigin-RevId: 697922880
2024-11-19 01:42:21 -08:00
Mahesh Madhav
8c6d25f7f8 Update c/enc/encode.c
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-11-19 09:52:16 +01:00
Mahesh Madhav
782aadd0ff Apply suggestions from code review
Co-authored-by: Eugene Kliuchnikov <eustas@google.com>
2024-11-19 09:52:16 +01:00
Mahesh Madhav
1054ecc262 Add static variables as per code review comments. 2024-11-19 09:52:16 +01:00
Mahesh Madhav
cec846f88e Update c/enc/block_splitter_inc.h
Added a digit of precision
2024-11-19 09:52:16 +01:00
Mahesh Madhav
cefec3ce9d Reduce fdiv's into fmul's
Provides small speedup on microarchitectures where the floating
point divide is slower than the floating point multiply.
2024-11-19 09:52:16 +01:00
Copybara-Service
f25050c0d0 Merge pull request #1190 from jkoritzinsky:warnings-cleanup
PiperOrigin-RevId: 696853690
2024-11-15 05:21:08 -08:00
Copybara-Service
d405b7f114 Merge pull request #1181 from google:dependabot/github_actions/ossf/scorecard-action-2.4.0
PiperOrigin-RevId: 696848167
2024-11-15 04:55:26 -08:00
Eugene Kliuchnikov
0e37f9391d Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.4.0 2024-11-12 16:55:37 +01:00
Jeremy Koritzinsky
aa54821999 Fix C4224 warnings when building with MSVC 2024-11-12 16:50:52 +01:00
Copybara-Service
a4d0581dfd Merge pull request #1210 from google:dependabot/github_actions/actions/upload-artifact-4.4.3
PiperOrigin-RevId: 695721243
2024-11-12 07:15:57 -08:00
dependabot[bot]
c766253ccc Bump actions/upload-artifact from 4.0.0 to 4.4.3
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4.0.0 to 4.4.3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](c7d193f32e...b4b15b8c7c)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-12 14:02:26 +00:00
Copybara-Service
45f4cc75bf Merge pull request #1214 from google:dependabot/github_actions/actions/cache-4.1.2
PiperOrigin-RevId: 695700673
2024-11-12 06:01:25 -08:00
Copybara-Service
ca43fd5d99 Merge pull request #1215 from google:dependabot/github_actions/actions/setup-python-5.3.0
PiperOrigin-RevId: 695700558
2024-11-12 06:00:23 -08:00
dependabot[bot]
d3471e6ff8 Bump actions/cache from 3.3.2 to 4.1.2
Bumps [actions/cache](https://github.com/actions/cache) from 3.3.2 to 4.1.2.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](704facf57e...6849a64899)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-12 13:23:35 +00:00
dependabot[bot]
cb63a61918 Bump actions/setup-python from 5.0.0 to 5.3.0
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5.0.0 to 5.3.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](0a5c615913...0b93645e9f)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-11-12 13:23:33 +00:00
Copybara-Service
28482c4024 Merge pull request #1221 from eustas:matJ
PiperOrigin-RevId: 695692377
2024-11-12 05:22:30 -08:00
Evgenii Kliuchnikov
4c4b1297c6 "Hermetise" bazel-java tests
Should fix windows pipeline.
2024-11-12 13:10:45 +00:00
Copybara-Service
f2372f2f57 Merge pull request #1218 from eustas:fijab
PiperOrigin-RevId: 695636494
2024-11-12 01:19:44 -08:00
Evgenii Kliuchnikov
7347e81db5 Fix Java Bazel build 2024-11-11 15:54:47 +00:00
Evgenii Kliuchnikov
4303850b01 No public description
PiperOrigin-RevId: 695336284
2024-11-11 07:51:35 -08:00
gorloffslava
1c380db165 Merge branch 'google:master' into master 2024-10-31 16:05:56 +05:00
Ilya Tokar
664952333f Make Brotli decompression faster
Makes it ~8% faster on my skylake desktop.

PiperOrigin-RevId: 689499172
2024-10-24 13:36:55 -07:00
gorloffslava
9018ef374f Merge branch 'google:master' into master 2024-10-16 10:57:44 +05:00
Ilya Tokar
350100a5bb Add BrotliCopyPreloadedSymbols function.
Add a single trivial use to avoid complier warning.

PiperOrigin-RevId: 676435629
2024-09-19 09:02:34 -07:00
Robert Obryk
f1bdfaa803 add size limit to buffer 2024-09-17 16:54:59 +02:00
Benjamin Leff
b93502da88 fix: renamed print 2024-08-29 19:17:37 -05:00
gorloffslava
2a01fd8f31 + Added ability to build Brotli Python bindings against system-provided brotli instead of vendored one 2024-07-29 20:05:34 +05:00
dependabot[bot]
3b83a05447 Bump ossf/scorecard-action from 2.3.1 to 2.4.0
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.3.1 to 2.4.0.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](0864cf1902...62b2cac7ed)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-07-29 08:39:42 +00:00
Brotli
39bcecf455 Fix hasher resolution for long windows.
PiperOrigin-RevId: 652545288
2024-07-15 11:27:36 -07:00
Brotli
a528bce9f6 Hoist the static bounds check out of the combined if check.
PiperOrigin-RevId: 639054702
2024-05-31 09:11:23 -07:00
Brotli
fe754f3459 Use a hash table header and SIMD to speed up hash table operations (similar to [Swiss Tables](https://abseil.io/about/design/swisstables)).
PiperOrigin-RevId: 638686412
2024-05-30 09:51:43 -07:00
Michael Hoisie
8a626fd486 No public description
PiperOrigin-RevId: 636183145
2024-05-22 08:54:21 -07:00
Brotli
04388304a6 Use a hash table header and SIMD to speed up hash table operations (similar to [Swiss Tables](https://abseil.io/about/design/swisstables)).
PiperOrigin-RevId: 632238409
2024-05-09 12:42:42 -07:00
Brotli
bb809ac908 Use a hash table header and SIMD to speed up hash table operations (similar to [Swiss Tables](https://abseil.io/about/design/swisstables)).
PiperOrigin-RevId: 631982664
2024-05-08 17:59:58 -07:00
Brotli
d01a4caaa8 Internal change
PiperOrigin-RevId: 626960053
2024-04-22 02:07:25 -07:00
Brotli
1b3a5ccb6e Prefetch the backreference hashtable bucket.
Place the prefetch before the last distance checks, to give the prefetch enough time to work.

PiperOrigin-RevId: 626228820
2024-04-18 20:00:02 -07:00
Evgenii Kliuchnikov
443af10a80 add (assumption) check
PiperOrigin-RevId: 625632989
2024-04-17 04:10:04 -07:00
Evgenii Kliuchnikov
c1c76e993f Don't check cur_ix_masked against ring_buffer_mask.
`cur_ix_masked` isn't changing from iteration to iteration, and `max_length` ensures we never find a match long enough to walk off the ring buffer.

PiperOrigin-RevId: 624701948
2024-04-14 06:36:02 -07:00
Brotli
709c4672d4 Fix minor syntax issues.
Missing semicolons.
Move checks below variable declarations for c89.

PiperOrigin-RevId: 624199887
2024-04-12 09:16:00 -07:00
Brotli
a76d96e730 Don't check cur_ix_masked against ring_buffer_mask.
`cur_ix_masked` isn't changing from iteration to iteration, and `max_length` ensures we never find a match long enough to walk off the ring buffer.

PiperOrigin-RevId: 624162764
2024-04-12 06:50:51 -07:00
Brotli
a813a6a1e4 Update the H5 hasher to use the H6's FN(STORE).
PiperOrigin-RevId: 623885589
2024-04-11 11:23:47 -07:00
Brotli
f964a1e8ac Internal change
PiperOrigin-RevId: 623073126
2024-04-09 00:19:11 -07:00
Brotli
cdbe7fc739 Internal change
PiperOrigin-RevId: 622802698
2024-04-08 04:30:44 -07:00
Brotli
b6f2d49feb Add load() statements for the builtin Bazel java rules
Loads are being added in preparation for moving the rules out of Bazel and into `rules_java`.

PiperOrigin-RevId: 621489058
2024-04-03 05:18:26 -07:00
Brotli
9351fa7ffb Compare 4 bytes when checking if a longer match is possible.
Loading and comparing 4 bytes is ~as fast as 1 byte, but allows us to avoid more full match length calculation.

PiperOrigin-RevId: 617556847
2024-03-20 10:30:00 -07:00
Brotli
9717649c31 Use BROTLI_MAX_STATIC_CONTEXTS instead of magic constants in encode.c
PiperOrigin-RevId: 615341475
2024-03-13 02:47:52 -07:00
Evgenii Kliuchnikov
ccec9628e4 add pure-kotlin decoder
PiperOrigin-RevId: 608917286
2024-02-21 02:33:05 -08:00
Evgenii Kliuchnikov
c1362a7903 further preparations for Kotlin transpilation
PiperOrigin-RevId: 603638823
2024-02-02 03:26:50 -08:00
Evgenii Kliuchnikov
200f37984a prepare java decoder for transpilation to Kotlin
PiperOrigin-RevId: 601023149
2024-01-23 23:47:13 -08:00
Evgenii Kliuchnikov
d5e697b3c7 remove dependency on os-specific defines
PiperOrigin-RevId: 600449944
2024-01-22 07:24:41 -08:00
Evgenii Kliuchnikov
adbc354d23 simplify log2 check; currently we rely more on compiler than build system
PiperOrigin-RevId: 598794971
2024-01-16 04:02:00 -08:00
Evgenii Kliuchnikov
02458f3443 further simplify Java build
PiperOrigin-RevId: 598790414
2024-01-16 03:37:10 -08:00
Evgenii Kliuchnikov
3396c67fea add brcat alias + flag to decompress concatenated streams
PiperOrigin-RevId: 598652401
2024-01-15 12:49:56 -08:00
Evgenii Kliuchnikov
033940f97c add comment (fingerprint) CLI feature
PiperOrigin-RevId: 597489910
2024-01-11 02:04:37 -08:00
Evgenii Kliuchnikov
2ad58d8603 use .bazelignore instead of fake repositories
PiperOrigin-RevId: 595931804
2024-01-05 01:57:44 -08:00
Copybara-Service
26b1fec26b Merge pull request #1103 from google:dependabot/github_actions/actions/upload-artifact-4.0.0
PiperOrigin-RevId: 595711813
2024-01-04 08:26:08 -08:00
Evgenii Kliuchnikov
1045ab52df Fix/simplify/improve Bazel build
PiperOrigin-RevId: 595656443
2024-01-04 03:33:11 -08:00
Eugene Kliuchnikov
3bd5b9c0a2 Merge branch 'master' into dependabot/github_actions/actions/upload-artifact-4.0.0 2024-01-04 10:00:13 +01:00
Evgenii Kliuchnikov
082c9626a4 add test for one-shot encoding/decoding with offset
PiperOrigin-RevId: 595407007
2024-01-03 08:18:52 -08:00
dependabot[bot]
2b3334d559 Bump actions/upload-artifact from 3.1.3 to 4.0.0
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3.1.3 to 4.0.0.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](a8a3f3ad30...c7d193f32e)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-03 14:59:27 +00:00
Copybara-Service
fa084310d1 Merge pull request #1102 from google:dependabot/github_actions/actions/setup-python-5.0.0
PiperOrigin-RevId: 595390609
2024-01-03 06:58:46 -08:00
Copybara-Service
0ef82f0c0d Merge pull request #1104 from hyperxpro:encode-fix
PiperOrigin-RevId: 595388650
2024-01-03 06:49:22 -08:00
Eugene Kliuchnikov
79a5e80a59 Merge branch 'master' into encode-fix 2024-01-03 13:34:22 +01:00
dependabot[bot]
7cf649decd Bump actions/setup-python from 4.7.1 to 5.0.0
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4.7.1 to 5.0.0.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](65d7f2d534...0a5c615913)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-03 10:51:23 +00:00
Evgenii Kliuchnikov
4c57a6484b drop Bazel JS build
PiperOrigin-RevId: 595345529
2024-01-03 02:50:41 -08:00
Evgenii Kliuchnikov
6b6adb7ae8 fix build for Microsoft-designed ARM64 ABI
PiperOrigin-RevId: 595334901
2024-01-03 02:01:23 -08:00
Aayush Atharva
428d056ddc Fix Encoder bug 2023-12-28 22:19:02 +05:30
Eugene Kliuchnikov
6046f00d41 Merge branch 'master' into macos-relocatable 2023-12-08 16:44:28 +01:00
Copybara-Service
fef82ea104 Merge pull request #1091 from google:dependabot/github_actions/actions/setup-python-4.7.1
PiperOrigin-RevId: 589126376
2023-12-08 07:36:21 -08:00
Eugene Kliuchnikov
96b255b95c Merge branch 'master' into dependabot/github_actions/actions/setup-python-4.7.1 2023-12-08 16:31:08 +01:00
Copybara-Service
0d1a0a4dfd Merge pull request #1095 from google:dependabot/github_actions/ossf/scorecard-action-2.3.1
PiperOrigin-RevId: 589124584
2023-12-08 07:27:13 -08:00
Eugene Kliuchnikov
a6eacaa3e3 Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.3.1 2023-12-08 16:22:45 +01:00
Copybara-Service
421be80782 Merge pull request #1084 from trofi:brotli-cmake-tweaks
PiperOrigin-RevId: 589121646
2023-12-08 07:12:58 -08:00
Eugene Kliuchnikov
adac2b0e7d Merge branch 'master' into brotli-cmake-tweaks 2023-12-08 15:39:58 +01:00
Eugene Kliuchnikov
bf867c126b Merge branch 'master' into dependabot/github_actions/actions/setup-python-4.7.1 2023-12-08 15:35:35 +01:00
Eugene Kliuchnikov
a1851fe3f7 Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.3.1 2023-12-08 15:35:19 +01:00
Evgenii Kliuchnikov
6ba678a7ce pull "InputStream" reference out of "pure" code
PiperOrigin-RevId: 586390725
2023-11-29 10:48:18 -08:00
Eugene Kliuchnikov
563078a462 Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.3.1 2023-11-27 15:41:43 +01:00
Evgenii Kliuchnikov
0dff3e5b0d fix CI workflows
PiperOrigin-RevId: 585630137
2023-11-27 06:09:48 -08:00
Eugene Kliuchnikov
c536542bc7 Merge branch 'master' into dependabot/github_actions/ossf/scorecard-action-2.3.1 2023-11-27 12:25:28 +01:00
Evgenii Kliuchnikov
2b6d8654d4 add an option to disable brotli tools
PiperOrigin-RevId: 585593185
2023-11-27 03:13:14 -08:00
Eugene Kliuchnikov
8b0e23084c Merge branch 'master' into macos-relocatable 2023-11-03 11:45:54 +01:00
dependabot[bot]
0adb12e0a4 Bump ossf/scorecard-action from 2.2.0 to 2.3.1
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.2.0 to 2.3.1.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](08b4669551...0864cf1902)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-30 09:03:11 +00:00
Evgenii Kliuchnikov
9b83be233e fix wording
PiperOrigin-RevId: 576788685
2023-10-26 02:03:20 -07:00
dependabot[bot]
4855abb020 Bump actions/setup-python from 4.7.0 to 4.7.1
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4.7.0 to 4.7.1.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](61a6322f88...65d7f2d534)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-10-09 08:18:41 +00:00
SpaceIm
75ea7317ef relocatable shared libs for macOS 2023-09-28 01:18:31 +02:00
Sergei Trofimovich
cff5803216 CMakeLists.txt: use CMAKE_INSTALL_FULL_MANDIR for mans install
Without the change install just fails for me as `SHARE_INSTALL_PREFIX`
is unset for me.

Following https://cmake.org/cmake/help/latest/module/GNUInstallDirs.html
I'm using absolute path expansion to install mans.
2023-09-21 16:39:53 +01:00
Sergei Trofimovich
3ad47114b8 CMakeLists.txt: use CMAKE_INSTALL_FULL_LIBDIR for runpath on darwin
Without the change on systems where `CMAKE_INSTALL_LIBDIR` is an
absolute path outside `CMAKE_INSTALL_PREFIX` (like `nixpkgs`) libraries
ended up embedding wrong RPATH and libraries failed to load.

The change uses suggestion from https://cmake.org/cmake/help/latest/module/GNUInstallDirs.html
to use `CMAKE_INSTALL_FULL_LIBDIR` (similar to library install code)
to enbed it as an RPATH.
2023-09-21 16:39:53 +01:00
Copybara-Service
53947c15f5 Merge pull request #1086 from google:dependabot/github_actions/actions/upload-artifact-3.1.3
PiperOrigin-RevId: 566563985
2023-09-19 02:21:54 -07:00
dependabot[bot]
662b00ee63 Bump actions/upload-artifact from 3.1.0 to 3.1.3
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3.1.0 to 3.1.3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v3.1.0...a8a3f3ad30e3422c9c7b888a15615d19a852ae32)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-18 08:55:54 +00:00
Evgenii Kliuchnikov
ce9c16e882 upload full testdata archive
PiperOrigin-RevId: 565017690
2023-09-13 05:13:57 -07:00
Evgenii Kliuchnikov
63402aa8af use sha-versions for most gh actions
PiperOrigin-RevId: 564692809
2023-09-12 05:49:37 -07:00
Copybara-Service
91d1b2d623 Merge pull request #1079 from google:dependabot/github_actions/actions/checkout-4
PiperOrigin-RevId: 564669791
2023-09-12 03:46:40 -07:00
dependabot[bot]
c308b90e7b Bump actions/checkout from 3 to 4
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-12 06:32:12 +00:00
Copybara-Service
9da1c56448 Merge pull request #1080 from google:dependabot/github_actions/ossf/scorecard-action-2.2.0
PiperOrigin-RevId: 564616141
2023-09-11 23:31:20 -07:00
dependabot[bot]
cd158a41f4 Bump ossf/scorecard-action from 2.1.2 to 2.2.0
Bumps [ossf/scorecard-action](https://github.com/ossf/scorecard-action) from 2.1.2 to 2.2.0.
- [Release notes](https://github.com/ossf/scorecard-action/releases)
- [Changelog](https://github.com/ossf/scorecard-action/blob/main/RELEASE.md)
- [Commits](e38b1902ae...08b4669551)

---
updated-dependencies:
- dependency-name: ossf/scorecard-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-11 15:32:08 +00:00
Evgenii Kliuchnikov
39527d4a3c add dependabot
PiperOrigin-RevId: 564393796
2023-09-11 08:31:36 -07:00
Evgenii Kliuchnikov
cf95fbb9c5 reword cmake test generator warning
PiperOrigin-RevId: 564371898
2023-09-11 07:08:50 -07:00
Evgenii Kliuchnikov
e8569f79fc test building from the tarball
PiperOrigin-RevId: 564299396
2023-09-11 01:11:59 -07:00
Copybara-Service
896ea7a9a9 Merge pull request #1070 from kloczek:master
PiperOrigin-RevId: 563753673
2023-09-08 08:03:58 -07:00
Eugene Kliuchnikov
7561c2d847 Merge branch 'master' into master 2023-09-07 12:19:02 +02:00
Cosimo Lupo
2ce85662c5 setup.py: add long_description (#1073)
twine (the tool we use to upload packages to PyPI) is currently failing if the long_description (used to render a project's page on PyPI website) is not set. Somehow it complains that it is not well formatted reStructuredText, but it's simply empty...
This looks like a bug, or bad interaction between twince and setuptools, because the field is technically optional.
Also see https://github.com/pypa/twine/issues/960 and https://github.com/pypa/twine/issues/908.

This issue is currently preventing the upload of newly built Brotli v1.1.0 Python wheels to PyPI:
https://github.com/google/brotli-wheels/issues/18#issuecomment-1706910190

Anyway, we may well set the long_description to the content of the README.md file, as it's customary for python projects.
2023-09-07 11:28:09 +02:00
Tomasz Kłoczko
741610efd3 install man pages
cmake modyfication to install man pages.

Signed-off-by: Tomasz Kłoczko <kloczek@github.com>
2023-08-31 08:47:47 +00:00
Evgenii Kliuchnikov
ed738e842d more sophisticated golang TestEncoderFlush
PiperOrigin-RevId: 560982956
2023-08-29 04:00:29 -07:00
Evgenii Kliuchnikov
e7313b0c4e tune memory manager for BROTLI_EXPERIMENTAL
PiperOrigin-RevId: 560703386
2023-08-28 07:20:49 -07:00
Evgenii Kliuchnikov
c1bd196833 comb HAVE_UTIMENSAT definition
PiperOrigin-RevId: 560011681
2023-08-25 01:07:12 -07:00
Evgenii Kliuchnikov
2a5a088b03 more tuning for BROTLI_EXPERIMENTAL + clean-on-oom
PiperOrigin-RevId: 558771745
2023-08-21 06:36:24 -07:00
Evgenii Kliuchnikov
feb6d8bc80 prepare for 1.1.0rc
PiperOrigin-RevId: 558736892
2023-08-21 03:35:01 -07:00
Evgenii Kliuchnikov
3ebb2d30ab Move serialized dictionary feature behind the flag.
BROTLI_SHARED_DICTIONARY_SERIALIZED enum value is a part of API,
but it should not be used (will cause failures).
Changing how serialized dictionaries work won't be considered as an API change, until this feature is enabled.
Enabling this feature in the future will be considered as a "compatible" change.

PiperOrigin-RevId: 558091676
2023-08-18 02:55:33 -07:00
Evgenii Kliuchnikov
0f2157cc5e Update comment; fixes #1061
PiperOrigin-RevId: 557501089
2023-08-16 08:55:14 -07:00
Evgenii Kliuchnikov
9ff341daaf Replace TS strict_checks with source-level suppressions.
PiperOrigin-RevId: 555445920
2023-08-10 04:46:01 -07:00
Evgenii Kliuchnikov
8c7923045a reduce amount of padding in decoder structs
PiperOrigin-RevId: 555101669
2023-08-09 02:48:53 -07:00
Evgenii Kliuchnikov
a560089843 speedup q5-9 on large files
PiperOrigin-RevId: 553440457
2023-08-03 04:58:52 -07:00
Evgenii Kliuchnikov
0b89871d86 add links to other pages to README.md
PiperOrigin-RevId: 553395376
2023-08-03 01:15:22 -07:00
Evgenii Kliuchnikov
ac2c7bb179 mention used code style
PiperOrigin-RevId: 553095898
2023-08-02 03:47:37 -07:00
Evgenii Kliuchnikov
117b68b745 speedup encoder on q5-9 / 1MB+ files
PiperOrigin-RevId: 553087469
2023-08-02 03:05:57 -07:00
Evgenii Kliuchnikov
4125f2587c update GH actions extensions
PiperOrigin-RevId: 553083944
2023-08-02 02:50:11 -07:00
Evgenii Kliuchnikov
257884a3c5 restore BROTLI_VERSION var in CMake build
PiperOrigin-RevId: 552507047
2023-07-31 09:37:29 -07:00
Evgenii Kliuchnikov
d639a81d35 add option to delete files that are not "compressed"
PiperOrigin-RevId: 552472135
2023-07-31 07:19:14 -07:00
zhongfly
802475e724 fix missing version in CMake build (#1048) 2023-07-31 11:04:46 +02:00
Evgenii Kliuchnikov
27a9a80992 simplify CMake build
PiperOrigin-RevId: 552238545
2023-07-30 03:45:11 -07:00
Evgenii Kliuchnikov
0300be36ba add "repeat" to Java toy decoder
PiperOrigin-RevId: 551770992
2023-07-28 01:06:50 -07:00
Jyrki Alakuijala
4fc753e707 Merge pull request #1045 from google/eustas-update-export
Update .gitattributes
2023-07-27 10:39:17 +02:00
Eugene Kliuchnikov
0b8d3c6107 Update .gitattributes
Update list of exportes files in root directory
2023-07-26 12:56:04 +02:00
Evgenii Kliuchnikov
dbfebd13dc Workaround for GitHub / CodeQL bug
Sometimes GitHub Actions uses bare branch name whereas CodeQL always expects ref.
See https://github.com/github/codeql-action/issues/796

PiperOrigin-RevId: 550504283
2023-07-24 03:30:15 -07:00
Evgenii Kliuchnikov
779a49bfd6 bake in runtime constant
PiperOrigin-RevId: 549590409
2023-07-20 04:18:46 -07:00
Thomas Fischbacher
acc265655d Small Python modernization of Brotli code.
PiperOrigin-RevId: 549289787
2023-07-19 05:44:36 -07:00
Evgenii Kliuchnikov
4b827e4ce4 add CHANGELOG.md
PiperOrigin-RevId: 548971474
2023-07-18 05:24:13 -07:00
Evgenii Kliuchnikov
c3dc7d039c more careful bit-reader interruption
PiperOrigin-RevId: 548661043
2023-07-17 05:39:13 -07:00
Evgenii Kliuchnikov
c2848d5537 add synth test for metadata block
PiperOrigin-RevId: 548120163
2023-07-14 07:26:14 -07:00
Evgenii Kliuchnikov
de52bc7ce0 add "zero cost command" synth test
PiperOrigin-RevId: 548050521
2023-07-14 01:06:00 -07:00
Evgenii Kliuchnikov
d1fadddc94 drop make / automake files
PiperOrigin-RevId: 546866478
2023-07-10 07:31:18 -07:00
Evgenii Kliuchnikov
2d0947f1ea insert missing fuzz/WORKSPACE content
PiperOrigin-RevId: 546848285
2023-07-10 07:31:03 -07:00
Evgenii Kliuchnikov
2e6164d7b0 verbose error report in CLI
PiperOrigin-RevId: 546833411
2023-07-10 11:43:42 +00:00
Evgenii Kliuchnikov
70e7b1ae4a simplify building of fuzzer
PiperOrigin-RevId: 545950923
2023-07-10 11:43:27 +00:00
Evgenii Kliuchnikov
413b098564 Fix integration .pom
PiperOrigin-RevId: 545910020
2023-07-06 08:38:57 +00:00
Evgenii Kliuchnikov
dd3eb162b0 Fix JS tests
PiperOrigin-RevId: 545743271
2023-07-05 19:15:41 +00:00
Evgenii Kliuchnikov
11b8d7cb8a update .pom files
PiperOrigin-RevId: 545659932
2023-07-05 19:15:32 +00:00
Evgenii Kliuchnikov
28257b2e67 refine types in decode.js
PiperOrigin-RevId: 545575363
2023-07-05 19:15:24 +00:00
Evgenii Kliuchnikov
bc32ae12d5 add tests with UTF8/UTF16 non-ASCII text
PiperOrigin-RevId: 545424981
2023-07-05 19:15:11 +00:00
Evgenii Kliuchnikov
6ee96e291d Internal changes
PiperOrigin-RevId: 545262005
2023-07-04 07:55:25 +00:00
Evgenii Kliuchnikov
e252f1fc15 0.5-2.9% decoder speedup
PiperOrigin-RevId: 529412095
2023-07-04 07:55:16 +00:00
Evgenii Kliuchnikov
11abde4c96 Add tests for TS brotli decoder
PiperOrigin-RevId: 527326003
2023-07-04 07:55:06 +00:00
Evgenii Kliuchnikov
efe140adae add brotli.ts
PiperOrigin-RevId: 526966561
2023-07-04 07:54:57 +00:00
Evgenii Kliuchnikov
ffbe112328 JS: stronger typing
PiperOrigin-RevId: 526909255
2023-07-04 07:54:49 +00:00
Evgenii Kliuchnikov
e1f5788fb0 Fix internal buffer reset
PiperOrigin-RevId: 524301253
2023-07-04 07:54:41 +00:00
Evgenii Kliuchnikov
c0a43495ea JS decoder: code combing
PiperOrigin-RevId: 524076677
2023-07-04 07:54:32 +00:00
Evgenii Kliuchnikov
3afc509b84 JS decoder: code combing
PiperOrigin-RevId: 524016775
2023-07-04 07:54:24 +00:00
Evgenii Kliuchnikov
e9c47ed469 JS: use strict equality operators
PiperOrigin-RevId: 523319759
2023-07-04 07:54:16 +00:00
Evgenii Kliuchnikov
e5dba91c38 Add BROTLI_ENABLE_DUMP build option
PiperOrigin-RevId: 520047051
2023-07-04 07:54:07 +00:00
Evgenii Kliuchnikov
745fd08ef2 internal change
PiperOrigin-RevId: 517214701
2023-07-04 07:53:59 +00:00
Evgenii Kliuchnikov
f29c44ed38 Avoid nullptr with zero offset
PiperOrigin-RevId: 516808122
2023-07-04 07:53:51 +00:00
Evgenii Kliuchnikov
cb1ced3a25 speedup decoder by 0.2%-1.2%
PiperOrigin-RevId: 516754779
2023-07-04 07:53:42 +00:00
Evgenii Kliuchnikov
57c36a4f27 1.2-2.3% decoder speedup
PiperOrigin-RevId: 513524040
2023-07-04 07:53:33 +00:00
Evgenii Kliuchnikov
6db17c87f5 0.4-1.5% decoder speedup
PiperOrigin-RevId: 513248503
2023-07-04 07:53:20 +00:00
Eugene Kliuchnikov
6f7f5a163d Improve CodeQL workflow (#1027) 2023-07-03 15:21:44 +02:00
Eugene Kliuchnikov
e07b6148fd Add CodeQL workflow (#1026) 2023-07-03 14:48:33 +02:00
Eugene Kliuchnikov
ec107cf015 Create scorecard.yml
Install OSSF scoreboard
2023-07-03 12:28:07 +02:00
Felix Hanau
534076fa67 Add support for clang-cl compiler (#1021) 2023-07-03 11:43:35 +02:00
Eugene Kliuchnikov
50ebce107f Fix Bazel build (#1024) 2023-06-22 11:29:08 +02:00
Catena cyber
bfa15d4046 fuzz: make target resist allocation failures (#1023)
So that fuzzing can go on with simulated allocation failures
2023-06-22 10:27:21 +02:00
Zhang Na
1d8452b783 Add loongarch64 support (#1022) 2023-06-20 09:44:23 +02:00
Evgenii Kliuchnikov
ed1995b6bd Merge pull request #1005 from sullis:enum-values
PiperOrigin-RevId: 506138469
2023-02-01 09:20:28 +00:00
Evgenii Kliuchnikov
38e9add9d2 Fix permissions
PiperOrigin-RevId: 506096478
2023-02-01 09:20:14 +00:00
Evgenii Kliuchnikov
b2c86d1871 Decoder API: added API to attach metadata blocks callbacks
PiperOrigin-RevId: 505734532
2023-01-31 16:03:16 +00:00
Evgenii Kliuchnikov
04f294b18a Fix emitting 1-byte long metadata block
PiperOrigin-RevId: 505484299
2023-01-30 09:10:28 +00:00
Brotli
1e61e972fb speed up encoding by ~5 %
PiperOrigin-RevId: 505061835
2023-01-30 09:10:14 +00:00
Sean C. Sullivan
2ce0feba3c avoid array allocation in Encoder.Mode enum 2023-01-22 06:30:06 -08:00
Brotli
36533a866e Internal change
PiperOrigin-RevId: 502401179
2023-01-17 13:51:00 +00:00
Aron Parker
71fe6cac06 Fix BrotliEncoderEstimatePeakMemoryUsage (#1002)
Fixes https://github.com/google/brotli/issues/1001
2023-01-07 22:01:47 +01:00
Eugene Kliuchnikov
e3ea91d5c9 Java wrapper: allow using partial byte arrays (#999) 2023-01-04 15:38:17 +01:00
Eugene Kliuchnikov
0ea4603880 Fix MSVC warning (#998)
Fix #875
2023-01-04 12:10:29 +01:00
Eugene Kliuchnikov
ce92c95601 brotlidump: fix dictionary file discovery (#997) 2023-01-03 20:44:14 +01:00
Eugene Kliuchnikov
0ff60731f8 Add security policy (#996) 2023-01-03 18:24:47 +01:00
Eugene Kliuchnikov
81181ecfb6 Cleanup (#995) 2023-01-03 17:18:05 +01:00
Eugene Kliuchnikov
a2cc451df2 Add win release assets (#994)
Fix #983
2023-01-03 17:16:17 +01:00
Ma Lin
c8df4b3049 Python: use a new output buffer code (#902)
Currently, the output buffer is a std::vector<uint8_t>.
When the buffer grows, resizing will cause unnecessary memcpy().

This change uses a list of bytes object to represent output buffer, can avoid the extra overhead of resizing.
In addition, C++ code can be removed, it's a pure C extension.
2022-12-29 14:07:16 +01:00
Eugene Kliuchnikov
509d4419bd Copy ns time stat (#992) 2022-12-22 16:05:25 +01:00
Eugene Kliuchnikov
81dc1c86c3 Ramp up CMake to v3 (#991)
Drive-by: drop premake5 support
2022-12-22 12:15:55 +01:00
Jack
a7b7839fd4 Add *.d to gitignore (#975) 2022-12-21 09:52:31 +01:00
Eugene Kliuchnikov
3152d995b3 Replace deprecated win-2016 workflows (#990)
* Remplace deprecated win-2016 workflows

* Update action/checkout to v3
2022-12-20 17:35:26 +01:00
Eugene Kliuchnikov
c48ebca4a8 Fix bazel build (#989) 2022-12-20 12:25:26 +01:00
Kleis Auke Wolthuizen
9b53703237 CMake: ensure static libraries are still installed on Emscripten (#988)
Similar to commit ce222e317e.
2022-12-20 11:03:21 +01:00
Adrian Perez
641bec0e30 CMake: Allow using BUILD_SHARED_LIBS to choose static/shared libs (#655)
By convention projects using CMake which can build either static or
shared libraries use a BUILD_SHARED_LIBS flag to allow selecting between
both: the add_library() command automatically switches between both using
this variable when the library kind is not passed to add_library(). It
is also usual to expose the BUILD_SHARED_LIBS as an user-facing setting
with the option() command.

This way, the following will both work as expected:

   % cmake -DBUILD_SHARED_LIBS=OFF ...
   % cmake -DBUILS_SHARED_LIBS=ON ...

This is helpful for distributions which need (or want) to build only
static libraries.
2022-12-16 11:42:42 +01:00
Aayush Atharva
3914999fcc Fix typo (#951) 2022-11-17 14:49:55 +01:00
Lukas Oberhuber
f842c1bcf9 fix macos rpath (#976)
Without this patch, the three libraries are not provided with valid
rpaths, meaning they are not packaged correctly for macos.

c.f. https://github.com/google/brotli/issues/934
(which is a similar issue) but should be fixed by this fix as well.

Also https://gitlab.gnome.org/Infrastructure/gimp-macos-build/-/merge_requests/129
2022-11-17 14:37:20 +01:00
Michal Josef Špaček
ae212a792e Fix bootstrap version computing with custom bc (#978)
When i have ~/.bc configuration file with content:
scale=2
which is changing default behaviour (scale=0), bootstrap is not working.
2022-11-17 14:31:35 +01:00
Evgenii Kliuchnikov
a8f5813b84 Update
Documentation:
  - add note that brotli is a "stream" format, not an archive-like
  - regenerate .1 with Pandoc
Build:
  - drop legacy "BROTLI_BUILD_PORTABLE" option
  - drop "BROTLI_SANITIZED" definition
Code:
  - c: comb includes
  - c/enc: extract encoder state into separate header
  - c/enc: drop designated q10 codepath
  - c/enc: dealing better with flushing of empty stream
  - fix MSVC compilation
API:
  - py: use library version instead of one in version.h
  - c: add plugable API to report consumed input / produced output
  - c/java: support "lean" prepared dictionaries (without copy of source)
2022-11-17 13:03:09 +00:00
清靈語
388d0d53fb add pyproject.toml (#987)
* add pyproject.toml 

pypa/pip#8559

https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/#fallback-behaviour

* modify requirements

https://github.com/google/brotli/pull/987#issuecomment-1315486841
2022-11-15 18:39:13 +01:00
Keith Smiley
6d03dfbedd Fix -Wstrict-prototypes warnings (#985)
Envoy builds brotli with -Werror, and these strict prototypes are picked
up by newer versions of clang.
2022-10-25 21:29:58 +02:00
Anonymous Maarten
9801a2c5d6 Wrap interface include directories with BUILD_INTERFACE generator expression (#966)
* Wrap interface include directories with BUILD_INTERFACE generator expression

When exporting a CMake target using install(TARGETS) + install(EXPORT),
CMake requires all include directories to be clean of build system
directories.

https://cmake.org/cmake/help/latest/prop_tgt/INTERFACE_INCLUDE_DIRECTORIES.html

This change also allows use of brotli as a CMake subproject and
installing + exporting it.

* Fix typo in generator expression
2022-05-12 10:50:48 +03:00
Ryan Schmidt
f09b2555ac bootstrap: Fix exit code when autoreconf fails (#962)
Fixes:

./bootstrap: line 37: exit: $: numeric argument required
2022-05-11 19:21:00 +03:00
Ryan Schmidt
c9eb85691f Fix bootstrap on macOS (#965)
* bootstrap: Verify functionality of sed

Check for the existence of sed by running a simple substitution rather
than using the --version flag. This lets us remove the weird exclusion
of FreeBSD from checking the sed requirement, and fixes checking the sed
requirement on other systems like macOS that use BSD sed, which doesn't
support --version.

* bootstrap: Detect flag for sed extended RE

Detect whether sed needs -E or -r to enable extended regular
expressions. Fixes bootstrap on macOS, whose BSD sed does not support
-r.

GNU sed has supported -E as a synonym for -r since version 4.2 (2009),
initially as an undocumented option for compatibility with BSD sed:

http://git.savannah.gnu.org/cgit/sed.git/commit/sed/sed.c?id=3a8e165ab02487c372df217c1989e287625ce0ae

and later as a documented option after -E became POSIX:

http://git.savannah.gnu.org/cgit/sed.git/commit/sed/sed.c?id=8b65e07904384b529a464c89f3739d2e7e4d5135
2022-05-11 19:20:39 +03:00
Marco Scardovi
f4153a09f8 Fix for future versions of python (#911)
Starting python 3.10, the use of - instead of _ will get a warn (see https://bugs.gentoo.org/796281 for reference)

Signed-off-by: Marco Scardovi <marco@scardovi.com>
2022-01-10 13:08:10 +03:00
Mohammad Bahoosh
e83c7b8e8f Supress cmake warning (#931)
Not providing VERSION to "project" command will cause a warning.

Since this project's version is loaded from other files, this policy will help suppress the warning generated by cmake.
This policy is set because we can't provide "VERSION" in "project" command.
Use `cmake --help-policy CMP0048` for more information
2021-12-15 13:28:25 +03:00
Jyrki Alakuijala
4ec67035c0 Merge pull request #929 from jbms/fix-vla-parameter
Fix -Werror=vla-parameter errors with GCC 11.2.0
2021-12-07 01:47:07 +01:00
Eugene Kliuchnikov
8376f72ed6 Prepare for copybara (#939)
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-11-10 10:34:39 +01:00
Jeremy Maitin-Shepard
27dd726540 Fix -Werror=vla-parameter errors with GCC 11.2.0 2021-09-14 12:27:45 -07:00
Eugene Kliuchnikov
62662f87cd Strip "./" in includes (#925)
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-09-08 09:18:45 +02:00
Eugene Kliuchnikov
698e3a7f9d Update README.md
Fix typo in Gihtub actions badge
2021-08-31 15:24:35 +02:00
Eugene Kliuchnikov
a10269cea1 Update README.md (#923) 2021-08-31 15:22:23 +02:00
Eugene Kliuchnikov
0e42caf359 Migrate to github actions (#920)
Not all combinations are migrated to the initial configuration; corresponding TODOs added.

Drive-by: additional combinations uncovered minor portability problems -> fixed
Drive-by: remove no-longer used "script" files.

Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-08-31 14:07:17 +02:00
Eugene Kliuchnikov
68f1b90ad0 Update (#918)
Prepare to use copybara worklow.
2021-08-18 19:15:07 +02:00
Eugene Kliuchnikov
19d86fb9a6 Merge-in SharedDictionary feature (#916)
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2021-08-04 14:42:02 +02:00
Eugene Kliuchnikov
630b5084ee Update (#914)
* slimmer stack frames in encoder
 * fix MSAN problem in hasher_composite
   (not dangerous, only in large_window mode)
 * fix JNI decoder wrapper - power-of-two payloads fail to decode sometimes
 * reformat polyfil.js and decode_test.js
2021-07-29 22:29:43 +02:00
Dirk Lemstra
ce222e317e Enabled install when building with emscripten. (#906)
* Enabled install when building with emscripten.

* Also install the pkg-config files.
2021-06-23 10:12:21 +02:00
Adrián Herrera Arcila
0a3944c8c9 Fix VLA parameter warning (#893)
Make VLA buffer types consistent in declarations and definitions.
Resolves build crash when using -Werror due to "vla-parameter" warning.

Signed-off-by: Adrian Herrera <adr.her.arc.95@gmail.com>
2021-06-23 09:53:59 +02:00
Ikko Ashimine
bdcfb123c8 Fix typo in hash_composite_inc.h (#903)
defered -> deferred
2021-06-23 09:42:28 +02:00
Eugene Kliuchnikov
f8c6717745 Update (#908)
* re-enable Js build/test
  * improve decoder performance
  * rewrite dictionary data in Java/Js to a shorter uncompressed form
  * improve dictionary generation tool
2021-06-23 09:40:57 +02:00
Martin Grigorov
bbe5d72ba3 [Java] make it possible to set modes (generic, text, font) (#887)
* [Java] make it possible to set modes (generic, text, font)
2021-03-24 21:23:03 +01:00
Eugene Kliuchnikov
2f9277ff2f Update bazel WORKSPACE files (#896)
* Update bazel WORKSPACE files

* Use fresh OSX image

* Cache homebrew dirs for faster startup
2021-03-24 15:05:23 +01:00
Christian Clauss
63be8a9940 unichr was removed in Python 3 because all str are Unicode (#877)
https://python-future.org/compatible_idioms.html#unichr
2021-01-27 15:08:05 +01:00
marianopeck
2a51a85aa8 New Dart fast FFI-based Brotli implementation (#866)
New Dart compression framework with [fast FFI-based Brotli implementation](https://pub.dev/documentation/es_compression/latest/brotli/brotli-library.html) with ready-to-use prebuilt binaries for Win/Linux/Mac
2021-01-18 11:59:02 +01:00
Eugene Kliuchnikov
5692e422da Update (#852)
* Update

 * comments and clarifications in block_splitter
 * power-of-2 aligned allocations for Hasher
 * refresh decode.js from Java sources
 * disable JS build
2021-01-18 10:56:39 +01:00
Aayush Atharva
f16845614d Fix typo in variable name (#854)
* Fix typo in variable name

* Fix compile error
2021-01-08 13:24:44 +01:00
Juliy V. Chirkov
0e8afdc968 typo fix (#868) 2021-01-08 13:21:44 +01:00
dependabot[bot]
4969984a95 Bump junit from 4.12 to 4.13.1 in /java/org/brotli/dec (#853)
Bumps [junit](https://github.com/junit-team/junit4) from 4.12 to 4.13.1.
- [Release notes](https://github.com/junit-team/junit4/releases)
- [Changelog](https://github.com/junit-team/junit4/blob/main/doc/ReleaseNotes4.12.md)
- [Commits](https://github.com/junit-team/junit4/compare/r4.12...r4.13.1)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-01-08 13:21:02 +01:00
Evgenii Kliuchnikov
fcda9db7fd Shorten docs/brotli.svg
Kudos to @alrra
2020-10-08 14:50:33 +02:00
Tim Gates
685d7baea9 docs: Fix small typo: rougly -> roughly (#849) 2020-09-27 11:00:29 +02:00
Gábor Lipták
60b2a7ada5 Add Python 3.7 and 3.8 to Travis (#847) 2020-09-25 19:37:31 +02:00
Eugene Kliuchnikov
f6b3aa6d0f Add brotli logo (#845)
Co-authored-by: Eugene Kliuchnikov <eustas@chromium.org>
2020-09-24 13:43:44 +02:00
Kurt Mosiejczuk
f2ca32eda6 Change MANIFEST.in to include python regression tests in tarball (#841)
* Change MANIFEST.in to include python regression tests in tarball

* Python tests need the testdata from the base tests directory
2020-09-21 13:24:13 +02:00
Gabriel
97006561ea Fix VC C++ 12.0 BROTLI_MSVC_VERSION_CHECK calls (#843) 2020-09-21 13:22:56 +02:00
Dmitry Rozhkov
0cd2e3926e Fix MSVC linker error (#840)
The -lm linker option is not known to MSVC and setting it triggers
errors in some build systems:

  [6,366 / 6,367] Linking source/exe/envoy-static.exe; 11s remote
  LINK : warning LNK4044: unrecognized option '/lm'; ignored
  LINK : error LNK1218: warning treated as error; no output file generated
  ERROR: C:/source/source/exe/BUILD:22:16: Linking of rule
    '//source/exe:envoy-static' failed (Exit 4044): link.exe failed: error
    executing command

Do not set -lm in case of MSVC.
2020-09-08 16:48:31 +02:00
Pavel Rosický
90fd2b60cc add execution time (#834) 2020-09-07 10:53:03 +02:00
Dmitry Rozhkov
7e8e207ce2 Fix clang-10 compilation issue (#839)
clang-10 throws the following error:
In file included from external/org_brotli/c/enc/bit_cost.c:9:
external/org_brotli/c/enc/./bit_cost.h:48:16: error: implicit conversion
from 'size_t' (aka 'unsigned long') to 'double' may lose precision
[-Werror,-Wimplicit-int-float-conversion]
  if (retval < sum) {
             ~ ^~~
1 error generated.

Make the conversion explicit.
2020-09-07 09:40:03 +02:00
Eugene Kliuchnikov
09b0992b6a Revert "Add runtime linker path to pkg-config files (#740)" (#838)
This reverts commit 31754d4ffc.
2020-09-02 11:38:26 +02:00
Evgenii Kliuchnikov
0545759b2e Address issues noted in #833 2020-08-28 10:14:08 +02:00
Evgenii Kliuchnikov
e61745a6b7 Re-release 2020-08-27 16:12:55 +02:00
Evgenii Kliuchnikov
db361a0bb9 Re-add python bindings to sources tarball 2020-08-27 16:01:44 +02:00
Eugene Kliuchnikov
d518e55ba7 Update README.md 2020-08-26 18:46:19 +02:00
Eugene Kliuchnikov
d052918255 Fix build files (#829) 2020-08-26 17:13:31 +02:00
Eugene Kliuchnikov
665e81dc9b New version: 1.0.8 (#827) 2020-08-26 14:36:02 +02:00
Eugene Kliuchnikov
223d80cfbe Update (#826)
* IMPORTANT: decoder: fix potential overflow when input chunk is >2GiB
 * simplify max Huffman table size calculation
 * eliminate symbol duplicates (static arrays in .h files)
 * minor combing in research/ code
2020-08-26 12:32:27 +02:00
Eugene Kliuchnikov
0c5603e07b Fix output parameter type for _BitScanReverse (#819)
Fix #811
2020-07-09 16:40:01 +02:00
Eugene Kliuchnikov
e8155d67b0 CMake: change default ("") build type to Release (#818)
Fix #817
2020-07-09 15:35:57 +02:00
Eugene Kliuchnikov
fc823290a7 Mute strerror/strcpy warnings is MSVC build. (#815) 2020-07-02 19:45:57 +02:00
Eugene Kliuchnikov
5519352661 Add workaround for lying feof. (#814)
Should fix #812
2020-07-02 17:57:40 +02:00
fisherwky
d2ea198232 Update platform.h (#813)
fix compile error (platform.h:362: error: cast discards qualifiers from pointer target type)
2020-06-30 11:23:07 +02:00
Nils Goroll
31754d4ffc Add runtime linker path to pkg-config files (#740)
Otherwise libraries will not be found at runtime when installing to a
path not included in the default runtime linker's path with programs
linking brotli configured via pkg-config.
2020-05-15 13:11:01 +02:00
OZone
8f093f5e84 .gitignore: Ignore .obj files (#805)
EDK II windows build produces .obj files in source tree
2020-05-15 13:05:03 +02:00
Eugene Kliuchnikov
f0db711f46 Filter sources for the tarball. (#808) 2020-05-15 13:04:17 +02:00
Eugene Kliuchnikov
7f740f1308 Update (#807)
- fix formatting
 - fix type conversion
 - fix no-op arithmetic with null-pointer
 - improve performance of hash_longest_match64
 - go: detect read after close
 - java decoder: support compound dictionary
 - remove executable flag on non-scripts
2020-05-15 11:06:21 +02:00
Eugene Kliuchnikov
f83aa5169e Update bazel to 2.2 + update config (#798)
Newer bazel does not support `maven_jar` rule anymore...
2020-03-31 14:38:01 +02:00
Clinton Ingram
924b2b2b9d Move TZCNT and BSR intrinsics to platform.h, add MSVC versions (#636) 2020-03-19 11:57:56 +01:00
Paul Vollmer
0503d8b766 Added go.mod file to go/cbrotli directory (#754)
* Added go.mod file

* go.mod removed go version
2020-03-19 11:54:51 +01:00
Cristi Vîjdea
f503cb709c Add HAVE_LOG2 build macro (#783)
* Add HAVE_LOG2 build macro

Fixes #781

* Rename macro to BROTLI_HAVE_LOG2 and move comment for visibility
2020-03-19 10:46:52 +01:00
Leo Neat
36ac0feaf9 Adding CIFuzz (#797) 2020-03-19 09:52:07 +01:00
shenglei10
666c3280cc Make types of variable match (#796) 2020-02-14 10:40:02 +01:00
agrieve
4b5771bee7 Add missing "const" to a couple of kConstants (#780)
These showed up in a Chromium audit:
https://bugs.chromium.org/p/chromium/issues/detail?id=747064#c8

Although already effectively const, adding "const" causes the symbols to
be moved into the read-only section of the binary.
2019-12-20 00:15:58 +01:00
Griffin Downs
c435f06675 Add vcpkg installation instructions (#776) 2019-10-01 22:53:10 +02:00
James Hilliard
5c3a9a937b Fix license in setup.py (#769) 2019-08-16 16:32:14 +02:00
Ammar Askar
afc4a74273 Add oss-fuzz fuzzing status badge to README (#767) 2019-08-13 15:49:30 +02:00
Eugene Kliuchnikov
35ef5c554d Disable PIC in EMCC mode. (#768) 2019-08-13 15:23:04 +02:00
Eugene Kliuchnikov
ca21dac8e5 Add an option to avoid building shared libraries. (#766)
Add an option to avoid building shared libraries (for building with EMCC)

Drive-by:
* maven: ramp up java level to minimal required
* travis: replace deprecated clang-5.0 with clang-7
* maven: fallback to jdk10 to void javadoc bug
2019-08-07 10:51:55 +02:00
Eugene Kliuchnikov
3d1767186d Fix include for EMCC build (#765) 2019-07-30 10:01:21 +02:00
Eugene Kliuchnikov
f1124c8524 More careful sanitizer detection (#764) 2019-07-22 14:29:51 +02:00
Eugene Kliuchnikov
c8b37e8fd1 Update (#762)
* put LICENSE file into .jar
 * fix typo
 * add clarification comment in PY wrapper
2019-07-17 14:39:56 +02:00
Eugene Kliuchnikov
40f0fdcdc1 Explicitly mark tests/testdata/* as binary. (#761)
Fixes #760

Drive-by:
 * update go_rules
 * modernize brotli_inc
 * fix wrapper build
 * update PY to 3 in Travis / OSX / Bazel build
 * upgrade JS Bazel rules.
2019-07-16 17:49:14 +02:00
Eugene Kliuchnikov
78e7bbc3c3 Update (#753)
* fix executable mode of decode.js
 * explain clang-analyser about non-nullability
 * fix "dead assignment"
 * rename proguard.cfg -> proguard.pgcfg
2019-05-03 11:51:11 +02:00
Eugene Kliuchnikov
4b2b2d4f83 Update (#749)
Update:

 * Bazel: fix MSVC configuration
 * C: common: extended documentation and helpers around distance codes
 * C: common: enable BROTLI_DCHECK in "debug" builds
 * C: common: fix implicit trailing zero in `kPrefixSuffix`
 * C: dec: fix possible bit reader discharge for "large-window" mode
 * C: dec: simplify distance decoding via lookup table
 * C: dec: reuse decoder state members memory via union with lookup table
 * C: dec: add decoder state diagram
 * C: enc: clarify access to static dictionary
 * C: enc: improve static dictionary hash
 * C: enc: add "stream offset" parameter for parallel encoding
 * C: enc: reorganize hasher; now Q2-Q3 require exactly 256KiB
           to avoid global TCMalloc lock
 * C: enc: fix rare access to uninitialized data in ring-buffer
 * C: enc: reorganize logging / checks in `write_bits.h`
 * Java: dec: add "large-window" support
 * Java: dec: improve speed
 * Java: dec: debug and 32-bit mode are now activated via system properties
 * Java: dec: demystify some state variables (use better names)
 * Dictionary generator: add single input mode
 * Java: dec: modernize tests
 * Bazel: js: pick working commit for closure rules
2019-04-12 13:57:42 +02:00
Eugene Kliuchnikov
9cd01c0437 Update WORKSPACE files. (#742) 2019-02-19 11:14:20 +01:00
Eugene Kliuchnikov
8109882ecf Fix #741 2019-02-18 11:31:48 +01:00
Justin Ridgewell
5805f99a53 Ensure decompression consumes all input (#730)
* Ensure decompression consumes all input

If not, it's a corrupt stream.

* Use byte strings
2018-11-12 10:36:00 +01:00
Eugene Kliuchnikov
d0ffe60b87 Verbose CLI + start pulling "Shared-Brotli" (#722)
* Verbose CLI + start pulling "Shared-Brotli"

 * vesbose CLI output; fix #666
 * pull `SHIFT` transforms; currently this is semantically dead code;
   later it will be used by "Shared-Brotli"
2018-10-24 16:06:09 +02:00
Eugene Kliuchnikov
d6d98957ca Ramp up version to 1.0.7 2018-10-23 12:24:40 +02:00
Eugene Kliuchnikov
a1e44975a7 Fix #698 2018-10-19 17:01:54 +02:00
Eugene Kliuchnikov
a799e34c7f Remove dependency to full JDK. This should speedup clean builds. (#719)
* Remove dependency to full JDK. This should speedup clean builds.

* Upgrade appveyor bazel
2018-10-18 17:25:05 +02:00
Stephen Kyle
7a153ebb09 make/build: ensure NEON is enabled and tested (#718)
Make sure the travis CI aarch32 bot tests NEON, and also that running
CROSS_COMPILE=arm-linux-gnueabihf make enables the use of NEON to
accelerate the back-reference copying.
2018-10-17 17:29:32 +02:00
Eugene Kliuchnikov
ce8951c3e9 Fix <arm_neon.h> inclusion guard. (#717) 2018-10-16 17:19:37 +02:00
Eugene Kliuchnikov
f7cbc97c96 Fix typo / minor formatting (#716)
* Fix typo / minor formatting / pull computable constant to the place of use.
2018-10-16 16:46:54 +02:00
Stephen Kyle
cc7a74f15f decode: fix NEON inclusion (#714)
The macro that checks for NEON support should be __ARM_NEON, not
__ARM_NEON__. [1]

AArch64 compilers define __ARM_NEON but not __ARM_NEON__.
AArch32 compilers currently seem to define both, but could be within their
rights to drop __ARM_NEON__ in future versions.

This change moves the check into the common/platform.h file, checks for
both forms, and sets BROTLI_TARGET_NEON if NEON support is available.

[1] Section 6.5.4 of the ARM C Language Extensions.
    (At the time of writing, the latest version was Release 2.1.)
2018-10-08 15:40:11 +02:00
Alexey Ivanov
c94c6f805c tools/brotli: improve window size autodetect (#710)
Window size is defined as:
    `(1 << BROTLI_PARAM_LGWIN) - 16`
in `c/include/brotli/encode.h`

Therefore we should probably take these 16 bytes into account.

Done basic manual testing:
$ python3 -c 'print ("A"*2046)' > t
$ bazel run -- //:brotli -w 0 -f -o $(realpath t).br $(realpath ./t)
$ python3 research/brotlidump.py t.br |& fgrep WSIZE
0000  c1                1000001 WSIZE   windowsize=(1<<12)-16=4080

New version properly detects window size of `4080`, while previous one used `2032`:
$ python3 research/brotlidump.py t.br |& fgrep WSIZE
0000  b1                0110001 WSIZE   windowsize=(1<<11)-16=2032
2018-10-02 16:28:37 +02:00
Stephen Kyle
9402ac5c08 decode: faster huffman code loading on 32-bit Arm (#703)
* platform: add macro for using the 'aligned' attribute

* decode: add accessor macros for HuffmanCode fields

Adds a constructor function for building HuffmanCode values
so they can be accessed quickly on different architectures.

Also adds macros for marking a HuffmanCode table pointer
that can be accessed quickly (BROTLI_HC_MARK_TABLE_FOR_FAST_LOAD),
adjusting the index into that table (BROTLI_HC_ADJUST_TABLE_INDEX),
and getting the .bits or .value fields out of the table at the
current index (BROTLI_HC_GET_BITS/VALUE).

For example, assuming |table| contains a HuffmanCode pointer:

  BROTLI_HC_MARK_TABLE_FOR_FAST_LOAD(table);
  BROTLI_HC_ADJUST_TABLE_INDEX(table, index_into_table);
  *bits = BROTLI_HC_GET_BITS(table);
  *value = BROTLI_HC_GET_VALUE(table);
  BROTLI_HC_ADJUST_TABLE_INDEX(table, offset);
  *bits2 = BROTLI_HC_GET_BITS(table);
  *value2 = BROTLI_HC_GET_VALUE(table);

All uses of the HuffmanCode have been updated appropriately.

* decode: add alternative accessors for HuffmanCode on Arm AArch32
2018-09-27 13:15:46 +02:00
Stephen Kyle
67f059eaf5 Cross compilation support (#709)
* build: add cross-compilation support to make

Set CROSS_COMPILE when running make to use the selected cross
compilation toolchain, such as arm-linux-gnueabihf, or
aarch64-linux-gnu.

Testing requires the presence of qemu - 'qemu-$(ARCH)' will be executed,
where ARCH is the first part of the toolchain triplet.

* build: add cross-compilation support to cmake

If C_COMPILER/CXX_COMPILER/CC/CXX are found to have cross-compilation
triplets in front of the compiler, then qemu will be used to execute the
tests.

* CI: add arm-linux-gnueabihf-gcc builder to Travis

The version of qemu available in Ubuntu trusty (as provided by Travis)
appears to have a bug in qemu-aarch64, which leads to the compatibility
tests failing on some inputs, erroneously rejecting the input as
corrupt.

Once Travis supports xenial, we could add an aarch64-gnu-linux-gcc
builder as well.

* CI: propagate cmake errors out of .travis.sh

Seems like even if cmake fails, the error isn't picked up by Travis.
2018-09-27 11:00:33 +02:00
Jørgen Ibsen
6eba239a5b Fix auto detect of bundled mode (#704)
Set bundled mode to ON when parent directory is not empty. Due to the
peculiarities of CMake if, comparing an undefined variable to the empty
string is false, so this likely never triggered.
2018-09-13 13:31:23 -04:00
Eugene Kliuchnikov
2216a0dd63 Update (#706)
Update
 * add ASAN/MSAN unaligned read specializations
 * add "brotli" prefix to u_uint64 type
 * increment version to 1.0.06
 * fix CoverityScan "unused assignment" warning
 * fix JDK 8<->9 incompatibility
 * add encoder optimization for empty input
 * regenerate JS decoder
 * unbreak Travis builds
2018-09-13 08:09:32 -04:00
Stephen Kyle
d4cd6cdf1c platform: fix unaligned 64-bit accesses on AArch32 (#702)
Ensures that Aarch32 Arm builds with an Armv8 compiler do not set
BROTLI_64_BITS.

This scenario is possible with ChromeOS builds, as they may use a
toolchain with the target armv7-cros-gnueabi, but with -march=armv8.
This will set __ARM_ARCH to 8 (defining BROTLI_TARGET_ARMV8), but will
also set __ARM_32BIT_STATE and not __ARM_64BIT_STATE. Without this,
illegal 64-bit non-word-aligned reads (LDRD) may be emitted.

Also fix unaligned 64-bit reads on AArch32 - STRD was still possible to
emit.
2018-07-25 11:43:06 +02:00
Eugene Kliuchnikov
8a073bd9e2 Revert "platform: fix unaligned 64-bit accesses on AArch32 (#699)" (#701)
This reverts commit 6d027d1648.
2018-07-24 17:32:13 +02:00
Stephen Kyle
6d027d1648 platform: fix unaligned 64-bit accesses on AArch32 (#699)
Ensures that Aarch32 Arm builds with an Armv8 compiler do not set
BROTLI_64_BITS.

This scenario is possible with ChromeOS builds, as they may use a
toolchain with the target armv7-cros-gnueabi, but with -march=armv8.
This will set __ARM_ARCH to 8 (defining BROTLI_TARGET_ARMV8), but will
also set __ARM_32BIT_STATE and not __ARM_64BIT_STATE. Without this,
illegal 64-bit non-word-aligned reads (LDRD) may be emitted.

Also fix unaligned 64-bit reads on AArch32 - STRD was still possible to
emit.
2018-07-24 17:29:50 +02:00
William A. Kennington III
fc4d345968 Fix missing header files (#695)
Our dist tarball is missing hash_rolling_inc.h and
hash_composite_inc.h, which causes subsequent autotools
builds to fail. Fix this by adding it to the sources list.

Signed-off-by: William A. Kennington III <william@wkennington.com>
2018-07-09 10:40:08 +02:00
Eugene Kliuchnikov
b601fe817b Ramp up version to 1.0.5 2018-06-27 17:03:45 +02:00
Cody Schroeder
ee2a5e1540 Update go_library to use standard importpath (#690)
* Update go_library to use standard importpath

Instead of using go_prefix, which is deprecated, the importpath attribute is made explicit.

* Add description to go/BUILD
2018-06-26 18:08:07 +02:00
Eugene Kliuchnikov
eb12ec04eb Update (#688)
* add rolling-composite-hasher for large-window mode
* make API methods explicitly public
2018-06-20 15:14:10 +02:00
Eugene Kliuchnikov
7505290ef9 Convert fuzzer to C99. (#686) 2018-06-18 14:39:38 +02:00
Eugene Kliuchnikov
ff05c35166 Add VS2017 release Appveyor build (#685) 2018-06-18 13:13:23 +02:00
Eugene Kliuchnikov
09cd3e877f Update 2018-06-11 15:17:26 +02:00
Eugene Kliuchnikov
8544ae858d Update (#680)
* fix MSVC warnings
 * cleanups
2018-06-09 11:17:13 +02:00
Eugene Kliuchnikov
1e7ea1d8e6 Inverse bazel project/workspace tree (#677)
* Inverse bazel workspace tree.

Now each subproject directly depends on root (c) project.

This helps to mitigate Bazel bug bazelbuild/bazel#2391; short summary:
Bazel does not work if referenced subproject `WORKSPACE` uses any
repositories that embedding project does not.

Bright side: building C project is much faster;
no need to download closure, go and JDK...
2018-06-04 17:53:16 +02:00
Eugene Kliuchnikov
29dc2cce90 Update golang and JS Bazel plugins to latest stable versions. (#676) 2018-05-31 13:21:04 +02:00
davidlt
f9b8c02673 Add RISC-V 64-bit (riscv64) platform configuration (#669)
Signed-off-by: David Abdurachmanov <david.abdurachmanov@gmail.com>
2018-05-22 14:35:04 +02:00
Eugene Kliuchnikov
48a25b3fa4 Fix #671 (#672) 2018-05-18 22:07:52 +02:00
Eugene Kliuchnikov
a4581c158e Add tools to download and transform static dictionary data. (#670) 2018-05-16 12:59:09 +02:00
Eugene Kliuchnikov
f5ed35d065 Update (#664)
* Update
 * fix ifdef style
 * get back to fine-compiler-version-based-macros (use Hedley)
 * fix q=0 histogram collection for very long copy/insert commands
2018-05-03 11:16:21 +02:00
Cosimo Lupo
f94cd51b5c appveyor: fix issue self-upgradig pip to v10 (#663)
Installing with --user will leave the old pip.exe script in the $PATH,
but running this will fail because pip 10 moved 'main' to internal
modules.

https://github.com/pypa/pip/issues/5240#issuecomment-382989420
2018-04-20 19:31:51 +02:00
Eugene Kliuchnikov
6000396155 Remove unprefixed macros from public headers (#662) 2018-04-20 14:10:55 +02:00
Eugene Kliuchnikov
68db5c0272 Update (#660)
* Update
 * improve q=1 compression on small files
 * fix "left shift before promotion"
 * fix osx Travis builds
2018-04-13 11:44:34 +02:00
Eugene Kliuchnikov
c6333e1e79 Fix MSVC compilation (#657)
* tell bazel not to pass strict options to a fancy compiler
 * fix signed-unsigned comparison warning found by MSVC
2018-03-29 10:37:07 +02:00
Eugene Kliuchnikov
0f3c84e745 Update (#656)
* proper fix for the "fall through" warning"
 * automatic NDIRECT/NPOSTFIX tuning (better compression)
 * fix unaligned access for `aarch64`-cross-`armhf` build
 * fix `aarch64` detection (10% decoder speedup)
 * expose `large_window` CLI option
 * make default window size 16MiB
 * ramp up version to 1.0.4
2018-03-27 22:29:22 +02:00
Adrian Perez
515fc62313 Tell CMake to not check for a C++ compiler (#653)
By default CMake checks both for C and C++ compilers, while the latter
is not needed. Setting the list of languages to just "C" in the call to
project() removes the unneeded check.
2018-03-26 21:41:18 +06:00
Eugene Kliuchnikov
2c03482569 Fix "memory leak" in python tests (#652)
OOMs on RPi (1GB)
2018-03-23 02:09:00 +06:00
Tobe O
a238f5bac9 Update README.md (#646)
Add mention of Dart native bindings
2018-03-20 17:53:32 +06:00
Eugene Kliuchnikov
631fe194a1 Update (#651)
* fix `bazel` build (ignore switch case fall-through)
* add `NPOSTFIX` / `NDIRECT` encoder parameters
* fix source file lists (add `params.h`)
* fix bug in `durchschlag`
* print clarifying messages wheb CLI argument parsing fails
2018-03-20 17:37:41 +06:00
Eugene Kliuchnikov
533843e354 Update (#643)
Update
 * make the zopflification aware of `NDIRECT`, `NPOSTFIX`
   (better compression in `font` mode)
 * add small and simple decoder tool
 * fix typo
 * Java: wrapper: make decoder channel more async-friendly

Ramp up version to 1.0.3 / 1.0.3
2018-03-02 15:49:58 +01:00
Eugene Kliuchnikov
35e69fc7cf New feature: "Large Window Brotli" (#640)
* New feature: "Large Window Brotli"

By setting special encoder/decoder flag it is now possible to extend
LZ-window up to 30 bits; though produced stream will not be RFC7932
compliant.

Added new dictionary generator - "DSH". It combines speed of "Sieve"
and quality of "DM". Plus utilities to prepare train corpora
(remove unique strings).

Improved compression ratio: now two sub-blocks could be stitched:
the last copy command could be extended to span the next sub-block.

Fixed compression ineffectiveness caused by floating numbers rounding and
wrong cost heuristic.

Other C changes:
 - combined / moved `context.h` to `common`
 - moved transforms to `common`
 - unified some aspects of code formatting
 - added an abstraction for encoder (static) dictionary
 - moved default allocator/deallocator functions to `common`

brotli CLI:
 - window size is auto-adjusted if not specified explicitly

Java:
 - added "eager" decoding both to JNI wrapper and pure decoder
 - huge speed-up of `DictionaryData` initialization

* Add dictionaryless compressed dictionary

* Fix `sources.lst`

* Fix `sources.lst` and add a note that `libtool` is also required.

* Update setup.py

* Fix `EagerStreamTest`

* Fix BUILD file

* Add missing `libdivsufsort` dependency

* Fix "unused parameter" warning.
2018-02-26 09:04:36 -05:00
Eugene Kliuchnikov
3af18990f5 Update go and closure bazel rules (#637)
* Update go and closure bazel rules
* Follow the new bazel go rules guide
* Swap go & closure rules initialization
* Update bazel to 0.10.0 in appveyor build
2018-02-08 14:38:10 +01:00
Daniel Chýlek
b5033d0e1e Fix brotlidump.py crashing when complex prefix code has exactly 1 non-zero code length (#635)
According to the format specification regarding complex prefix codes:

> If there are at least two non-zero code lengths, any trailing zero
> code lengths are omitted, i.e., the last code length in the
> sequence must be non-zero.  In this case, the sum of (32 >> code
> length) over all the non-zero code lengths must equal to 32.

> If the lengths have been read for the entire code length alphabet
> and there was only one non-zero code length, then the prefix code
> has one symbol whose code has zero length.

The script does not handle a case where there is just 1 non-zero code
length where the sum rule doesn't apply, which causes a StopIteration
exception when it attempts to read past the list boundaries.

An example of such file is tests/testdata/mapsdatazrh.compressed. I made
sure this change doesn't break anything by processing all *.compressed
files from the testdata folder with no thrown exceptions.
2018-02-08 12:48:24 +01:00
Eugene Kliuchnikov
da254cffdb Update (#630)
* merge {dec|enc}/port.h into common/platform.h
 * fix one-shot q=10 1-byte input compression
 * fix some unprefixed definitions
 * make hashers host-endianness-independent
 * extract enc/params.h from enc/quality.h
 * fix API documentation / typos
 * improve `BrotliEncoderMaxCompressedSize`
2017-12-12 14:33:12 +01:00
Jeremy Bicha
63e15bb3a6 Don't set rpath (#629) 2017-12-07 20:39:07 +01:00
Bernard Spil
62194f204d Work around Linuxisms (#627)
Missed this in my previous tests. Sorry for that.

On BSDs, both bc and sed are part of the base operating system. For sed this results in an error as the check construct (--version) is a GNU-ism and only works for GNU sed, not for bsd sed.
Similarly, BSD sed does not take parameters after the filename(s) operated on. Moving `-i` to the front fixes that. `-r` is provided for GNU compat in BSD sed as an alias of `-E`. The `-i` option in BSD sed requires an extension to work in-place.

(thank you for picking up the nginx module too!)
2017-12-04 15:17:49 +01:00
Eugene Kliuchnikov
2d6b298e11 Update Travis matrix (#626)
* Use Clang-5.0
* Disable unholy ASAN leak detector (to unbreak build)
* Reduce build matrix and use faster env, where compiler version is not important
* Add autotools build to Travis matrix
2017-11-30 20:54:04 +01:00
Eugene Kliuchnikov
c8c8389ed3 Do not rely on bash arithmetic in bootstrap (#625) 2017-11-30 11:02:54 +01:00
Bernard Spil
1ca15159d6 Fix missing symbols errors in libbrotlienc and dec (#623)
When using autotools to build the binary and libraries, the resulting libraries don't link `brotlicommon` or `m`.
This was detected when building cURL 7.57.0 which has now has brotli support. During configure it was failing
```
checking run-time libs availability... failed
configure: error: one or more libs available at link-time are not available run-time. Libs used at link-time: -lbrotlidec -lz -L/usr/local/lib
```
inspection of config.log showed missing symbols from libbrotlicommon as the cause.

This patch results in the encryption and decryption libs to be properly linked against libbrotlicommon and libm.
See also https://bugs.freebsd.org/223966
2017-11-29 22:38:16 +01:00
Eugene Kliuchnikov
0ad94eed00 Update (#620)
* add autotools build
* separate semantic and ABI version
* extract sources.lst (used by CMake and Automake)
* share pkgconfig templates (used by CMake and Automake)
* decoder: always set `total_out`
* encoder: fix `BROTLI_ENSURE_CAPACITY` macro (no-op after preprocessor)
* decoder/encoder: refine `free_func` contract
2017-11-28 15:37:28 +01:00
Lode Vandevenne
273de5a22f Update shared-brotli-fetch-spec.txt 2017-11-17 01:38:05 +01:00
Lode Vandevenne
a755ba3bd0 Update shared-brotli-fetch-spec.txt 2017-11-16 19:15:13 +01:00
Lode Vandevenne
bdda95ee55 Create shared-brotli-fetch-spec.txt 2017-11-16 19:13:39 +01:00
Eugene Kliuchnikov
3e58ea5f90 Update (#617)
* remove `const` on `BrotliDictionary` members
 * extend `ZofliNode` distance range to 128MiB
 * add missing `port.h` include to `quality.h`
 * fix typo in encoder API-doc
 * regenerate `decode.min.js`
2017-10-13 14:50:51 +02:00
Eugene Kliuchnikov
39ef4bbdcf Add new (fast) dictionary generator engine. (#616)
Add CLI for dictionary generation.
Add BUILD file for research folder
2017-10-13 11:25:03 +02:00
Eugene Kliuchnikov
9c75a2a26a Use bazel in appveyor (#612)
+publish jni dll
2017-10-11 22:26:37 +02:00
Tomáš Popela
a0c7dafe28 Fix permissions of various files in project (#613)
Move from 755 to 644.
2017-10-10 11:24:13 +02:00
Eugene Kliuchnikov
42d78807bb Improve Bazel/JNI portability (#611)
* Improve Bazel/JNI portability
* Update go and closure bazel addons
2017-10-09 17:07:34 +02:00
Eugene Kliuchnikov
4f8cd4c0f4 Fix fuzzer test script and add it to travis matrix (#606) 2017-09-26 13:49:30 +02:00
Eugene Kliuchnikov
5b4769990d Ramp up to version to 1.0.1 2017-09-22 14:05:06 +02:00
Eugene Kliuchnikov
bf6a6cda56 Fix parallel test execution 2017-09-22 13:13:22 +02:00
Eugene Kliuchnikov
7748a1dc69 Update README.md 2017-09-22 10:28:15 +02:00
Eugene Kliuchnikov
c60563591a Fix API documentation + theoretical NPEs (#602) 2017-09-20 15:02:01 +02:00
Eugene Kliuchnikov
b6a017492e Install static libraries as well (#601) 2017-09-20 10:04:06 +02:00
Eugene Kliuchnikov
37fb83ec0d Update: (#600)
* encoder: relax backward references candidates asserts
 * encoder: make RNG more platform-independent
 * encoder: remove "unused" param (context mode)
 * CLI: improve first-encounter experience
 * Java: update SynthTest
 * Java: refine proguard config
 * Java/JNI: fix one-shot compression workflow
2017-09-19 15:57:15 +02:00
Eugene Kliuchnikov
61a5015938 Update README.md 2017-09-19 11:29:55 +02:00
Eugene Kliuchnikov
4760f7db47 Update MANIFEST.in 2017-09-19 11:25:55 +02:00
Eugene Kliuchnikov
248bddd0d7 Update README.md 2017-09-19 10:46:10 +02:00
Eugene Kliuchnikov
52f0483332 Build both static and shared libs with CMake (#599) 2017-09-19 09:40:48 +02:00
Eugene Kliuchnikov
87b43eb61b Reduce / update travis build matrix. (#598) 2017-09-18 13:52:53 +02:00
Eugene Kliuchnikov
6b1d0ab53d CI config
* Appveyor: publish artifacts on bintray
* Appveyor & Travis: build only master branch
2017-09-18 13:05:47 +02:00
Eugene Kliuchnikov
26a34a435c Employ make/gcc on Appveyor + push artifacts (#596) 2017-09-14 16:14:05 +02:00
Eugene Kliuchnikov
d7bce1e092 Update (#593)
* Update:
 * fix CLI error messages
 * fix CLI console IO on Windows
2017-09-07 20:27:49 +02:00
Eugene Kliuchnikov
fe09a5030c Update README.md 2017-09-04 14:54:51 +02:00
Eugene Kliuchnikov
a629289e32 Update (#590)
* add transpiled JS decoder
 * make PY wrapper accept memview
 * fix dictionary generator
 * speedup compression of RLEish data
2017-08-28 11:31:29 +02:00
Eugene Kliuchnikov
6535435413 Update (#589)
* cleanup
 * fix `unbrotli` CLI
 * Java retouch for faster JS decoder
2017-08-24 13:29:48 +02:00
Cosimo Lupo
4f455cac32 disable buidling/deployment of python wheels (#583)
* [appveyor] remove 'deploy' stage; only test python 2.7 and 3.6

all the other python versions are being built and tested on
https://github.com/google/brotli-wheels/blob/d571d63/appveyor.yml

* remove terrify submodule as not needed any more

* [travis] just test py2.7 and 3.6 on linux; remove extra osx python builds

All the other python versions for OSX are being built/tested on:
https://github.com/google/brotli-wheels/blob/d571d63/.travis.yml

Also, there's no need to build and deploy wheels here, as that's done
in the separate repository.

* [setup.py] only rebuild if dependency are newer; fix typo in list of 'depends'

https://github.com/python/cpython/blob/v3.6.2/Lib/distutils/command/build_ext.py#L485-L500

* [ci] only run 'python setup.py test'

if we run 'python setup.py built test', the setuptools 'test' command will
forcibly re-run the build_ext subcommand because it wants to pass the --inplace
option (it ignores whether it's up to date, just re-runs it all the time).

with this we go from running built_ext twice, to running it only once per build

* [Makefile] run 'build_ext --inplace' instead of 'develop' as default target

The 'develop' command is like 'install' in the sense that it
modifies the user's python environment.
The default make target should be less intrusive, i.e. just building
the extension module in-place without modify anything in the user's
environment.

We don't need to tell make about the dependency between 'test' and
'build' target as that is baked in the `python setup.py test` command.

* [Makefile] add 'develop' target; remove unnecessary 'tests' target

`make test` is good enough

* [Makefile] `setup.py test` requires setuptools; run `python -m unittest`

This will work even if setuptools is not installed, which is unlikely
nowadays but still our `setup.py` works with plain distutils, so
we may well have our tests work without setuptools.

* [python/README.md] add ref to 'develop' target; remove 'tests', just 'make test'

* [setup.py] import modules as per nicksay's comment

https://github.com/google/brotli/pull/583#discussion_r131981049

* [Makefile] add 'develop' to .PHONY targets

remove 'tests' from .PHONY

* [appveyor] remove unused setup scripts

We don't need to install custom python versions, we are
using the pre-installed ones on Appveyor.

* [appveyor] remove unneeded setup code
2017-08-23 20:45:13 +02:00
Alex Nicksay
019091f994 Python: Update bro_test to reference script directly (#582) 2017-08-08 10:25:39 +02:00
Eugene Kliuchnikov
3917011ddb Add link to 7Zip plugin 2017-08-04 15:40:57 +02:00
Eugene Kliuchnikov
d63e8f75f5 Update API, and more (#581)
Update API, and more:
 * remove "custom dictionary" support
 * c/encoder: fix #580: big-endian build
 * Java: reduce jar size
 * Java: speedup decoding
 * Java: add 32-bit CPU support
 * Java: make source code JS transpiler-ready
2017-08-04 10:02:56 +02:00
Alex Nicksay
0608253110 Python: Add a "make install" command and clarify installation docs (#578)
Closes #576
2017-08-02 16:59:46 +02:00
Alex Nicksay
bc541f74e1 Add an EditorConfig file to provide consistent style across editors. (#579) 2017-08-02 16:58:43 +02:00
Eugene Kliuchnikov
52441069ef Update (#574)
* Update
 * decoder: better behavior after failure
 * encoder: replace "len_x_code" with delta
 * research: add experimental dictionary generator
 * python: test combing
2017-07-21 10:07:24 +02:00
Denys Tsomenko
172a378deb add BROTLI_DEC_API to methods (#572) 2017-07-11 17:22:44 +02:00
Reza Tavakoli
5aabc7a6ab Added windows platform support to premake (#567)
* Added windows platform support to premake

Win32 and Win64 configuration support for visual studio solutions

* Update premake5.lua

Fixed platform support for linux, made x64 default

* Update premake5.lua

Fix typo
2017-07-10 19:57:05 +02:00
Eugene Kliuchnikov
1becbbf231 Update (#569)
* add misssing fclose in `brotli.c`
 * add basic tests for python `Decompressor` type
 * minor lint fixes in `_brotli.cc`
2017-06-30 13:09:50 +02:00
Janek
58f5c37f3b Python: Decompressor: Streaming decompression support (#546)
python-brotli has Compressor for streaming compression but nothing for
streaming decompression.
This is a straight-forward copy of the Compressor code into the new
class Decompressor.
2017-06-28 16:32:28 +02:00
Eugene Kliuchnikov
efdff3f14e Fix linux-bazel build (#566)
Install bazel via apt-source
2017-06-22 11:38:49 +02:00
Eugene Kliuchnikov
e51eae564f Update .travis.yml 2017-06-22 10:58:13 +02:00
Dominik Homberger
a423b33a77 Update Related projects (#565)
Add BrotliHaxe
2017-06-22 10:07:07 +02:00
Eugene Kliuchnikov
a4d2956ded Update wrappers (#564)
* golang: add build information via `cgo.go`
* golang: fix lgwin parameter behavior
* Java: add proguard configuration
2017-06-21 10:59:38 +02:00
Elouan Martinet
00cacfdff6 Fix compilation issue with BROTLI_ALLOC macro using GCC 7 (-Wint-in-bool-context) (#562) 2017-06-17 15:22:07 +02:00
Eugene Kliuchnikov
05d5f3d77a Update (#560)
Update:
 * add decoder API to avoid ringbuffer reallocation
 * fix MSVC warnings
 * remove dead code
2017-06-13 12:52:56 +02:00
Eugene Kliuchnikov
0fceb906ec Fix bazel go build (#558) 2017-06-07 12:47:48 +02:00
Mike Tzou
31d0629b1d Readme improvements (#557)
* [README] Use tools.ietf.org for displaying RFC7932

tools.ietf.org has HTML links which is helpful when reading
in browser

* [README] Add appveyor badge
2017-06-06 17:58:46 +03:00
Eugene Kliuchnikov
19dc934e39 Add JNI wrappers. (#556) 2017-06-01 13:51:18 +02:00
Eugene Kliuchnikov
03739d2b11 Update (#555)
Update:
 * new CLI; bro -> brotli; + man page
 * JNI wrappers preparation (for bazel build)
 * add raw binary dictionary representation `dictionary.bin`
 * add ability to side-load brotli RFC dictionary
 * decoder persists last error now
 * fix `BrotliDecoderDecompress` documentation
 * go reader don't block until necessary
 * more consistent bazel target names
 * Java dictionary data compiled footprint reduced
 * Java tests refactoring
2017-05-29 17:55:14 +02:00
Eugene Kliuchnikov
2c001010aa Unify artifact installation (#544) 2017-05-24 17:19:34 +02:00
Eugene Kliuchnikov
0a84e9bf86 Transpile Java speedup (#548) 2017-05-07 17:40:12 +02:00
Eugene Kliuchnikov
4363f2d74b Speedup Java decoder. (#547)
* geo corpus decodes ~5% faster
 * fetchlogs corpus decodes ~25% faster
2017-05-07 17:13:03 +02:00
Stefan Bodewig
a015b42683 turn java library into an OSGi bundle (#545) 2017-05-04 20:27:42 +02:00
Eugene Kliuchnikov
d00ccae57f Split auto-CMake and plain CMake build manual 2017-04-28 13:16:59 +02:00
Eugene Kliuchnikov
6ece1d8791 Move files & update paths (#541)
* Move files & update paths

* Rename build to scripts.

* Fix paths

* Fix script.
2017-04-23 14:07:08 +02:00
Eugene Kliuchnikov
04de756ad2 Simplify go brotli wrapper. (#540)
Based on PR #533.

Kudos to Bryan (bcmillis@).
2017-04-13 20:05:36 +02:00
Eugene Kliuchnikov
f2aa4d1e8c Add C# transpilation script. (#538) 2017-04-10 21:16:08 +02:00
Eugene Kliuchnikov
66e798d46a Update API to v1.0.0 (#537)
Make Java decoder fully transpilable to C#.
2017-04-10 15:39:00 +02:00
Eugene Kliuchnikov
46c1a881b4 Pull down version for v0.6.0 release 2017-04-10 10:42:24 +02:00
Eugene Kliuchnikov
21c118ba77 Update c- and java-decoder: (#536)
* speedup java decoder
   * avoid masking
   * avoid excessive fillBits
   * streamline uncompressed block processing
 * make java decoder more transpilation-friendly
 * avoid non-essential goto in c-decoder
2017-04-05 18:50:01 +02:00
Eugene Kliuchnikov
e12a7a2d28 Add Brotli logo to README head (#535) 2017-04-04 13:21:36 +02:00
Eugene Kliuchnikov
7a09531f69 Cleanup 2017-03-30 16:50:06 +02:00
Eugene Kliuchnikov
e5b7c16b98 Same file name is not permitted overall! (#532) 2017-03-24 14:49:02 +01:00
Eugene Kliuchnikov
e77799b084 Fix bintray release structure (#531)
Same file name is not allowed across packages in one version.
2017-03-24 14:31:30 +01:00
Eugene Kliuchnikov
51d6780b04 Actually publich artifacts to Bintray (#530) 2017-03-24 13:40:30 +01:00
Eugene Kliuchnikov
6715130c24 Fix bintray json (#529) 2017-03-24 13:03:16 +01:00
Eugene Kliuchnikov
187904a4b8 Upload binaries to bintray (#528) 2017-03-24 12:54:20 +01:00
Eugene Kliuchnikov
29ad4db47d Break build on sha256sum mismatch (#527) 2017-03-23 16:23:57 +01:00
Eugene Kliuchnikov
22421ebe30 Bazel build on linux/osx (#526) 2017-03-23 13:35:53 +01:00
Eugene Kliuchnikov
ee5c719057 Build and test java decoder with Maven 2017-03-22 19:13:59 +01:00
Eugene Kliuchnikov
a657d9969d Add go wrapper, streamline java decoder: (#524)
* add (c)brotli golang wrapper
 * remove (language-specific) enums in java decoder
2017-03-22 12:41:19 +01:00
Eugene Kliuchnikov
8a06e02935 Better compression (#523)
Better compression:
 * use more complex content modeling on 1MiB+ files
2017-03-21 16:08:23 +01:00
Eugene Kliuchnikov
1ff78b877f Prevent fuzzer timeouts on compression-bomb samples (#522)
* Prevent fuzzer timeouts on compression-bomb samples.

* Fix fuzzer lanucher
2017-03-10 16:01:49 +01:00
Eugene Kliuchnikov
52ce8670eb Fix typos (#521) 2017-03-09 17:34:16 +01:00
Eugene Kliuchnikov
cdca91b6f5 Update common, decoder, encoder, java (#520)
Common:
 * wrap dictionary data into `BrotliDictionary` structure
 * replace public constant with getter `BrotliGetDictionary`
 * reformat dictionary data

Decoder:
 * adopt common changes
 * clarify acceptable instance usage patterns
 * hold reference to dictionary in state

Encoder:
 * adopt common changes
 * eliminate PIC spots in `CreateBackwardReferences`
 * add per-chunk ratio guards for q0 and q1
 * precompute relative distances to avoid repeated calculations
 * prostpone hasher allocation/initialization
 * refactor Hashers to be class-like structure
 * further improvements for 1MiB+ inputs
 * added new hasher type; made hashers more configurable

Java:
 * Pull byte->int magic to `IntReader` from `BitReader`
2017-03-06 14:22:45 +01:00
Ian Duncan
aaa4424d9b Fix CMakeLists.txt specifying a nonexistent pkgconfig package (#518) 2017-03-01 16:19:57 +01:00
Eugene Kliuchnikov
c931e576d2 Move java/ to java/org/brotli/ to fix sources.jar structure (#517)
Also added man pages to `docs/`
2017-02-28 16:59:52 +01:00
Eugene Kliuchnikov
aaac88a1e0 Switch to 0.2.0-SNAPSHOT (#515) 2017-02-20 16:16:45 +01:00
Eugene Kliuchnikov
527db7af8c Release org.brotli.* 0.1.0 (#514) 2017-02-20 15:51:48 +01:00
Eugene Kliuchnikov
56a7fda830 Java: fix typos and adjust visibility. (#513) 2017-02-20 14:04:55 +01:00
Evan Nemerson
d03c38da7f Blacklist PGI from using conformant array parameters. (#511)
* Blacklist PGI from using conformant array parameters.

There is a bug in pgcc with conformant array parameters where the
length argument is a pointer which triggers a compiler error
(PGC-S-0094, to be specific).  The issue has been reported to PGI and
is being tracked internally as TPR 23778.  For more information, see

  https://www.pgroup.com/userforum/viewtopic.php?t=5501

* travis: Add PGI Community Edition build.

For details on the installation script, see
https://github.com/nemequ/pgi-travis
2017-02-19 10:06:13 +01:00
Eugene Kliuchnikov
53366083e0 Prepare org.brotli.dec for deployment. (#512) 2017-02-17 15:39:34 +01:00
Eugene Kliuchnikov
9fa1ad5a91 Fix "zero-distance-code", take 2 (#506) 2017-02-08 21:14:01 +01:00
Eugene Kliuchnikov
0749d9ca8b Fix #502 decoder bug (#505)
Decoder may have produced invalid output if:
 * at offset 0..3 dictionary word with index 3..0 for some length N is used
   and distance is encoded with direct distance code 0, and
 * at least one of next 4 commands use value from distance ringbuffer
2017-02-07 15:35:03 +01:00
Eugene Kliuchnikov
11df843cf0 Update encoder (#504)
* pull `BROTLI_MAX_BACKWARD_LIMIT` to constants
 * split generic and Zopfli backward references code
 * pull hashers init and stitch invocation to encoder
 * make `dictionary_hash` a compilation unit
 * add `size hint` parameter
 * add new hasher
 * use `size hint` to pick new hasher for q4
 * modernize clz guard (fix #495)
 * move `hash to binary tree` to separate file
 * add `Initialize` and `Cleanup` to all hashers
 * do not raise OOM if malloc(0) == NULL (fix #500)
2017-02-06 14:20:43 +01:00
Eugene Kliuchnikov
8d3fdc1dfe Update encoder (#497)
* pad dictionary LUTs to length 32, etc. (#493)
 * avoid using INFINITY constant (#496)
 * make dictionary_hash.h more compact
 * add "disable literal context modelling" parameter
2017-01-26 11:32:18 +01:00
Eugene Kliuchnikov
7e347a7c84 Update encoder (#492)
* fix comment position in `context.h`
 * fix typo in internal quality constant name
 * deduplicate `BuildMetaBlockGreedy` code
 * simplify aggregation in `ChooseContextMap`
2016-12-22 15:55:05 +01:00
Eugene Kliuchnikov
27d94590a2 Research (#491)
* add advanced mode for optimal references generator
 * fix #489

Thanks to Ivan Nikulin for working on it.
2016-12-22 13:03:28 +01:00
Eugene Kliuchnikov
fe9f9a9182 Split brotli common/dec/enc .pc files (#490)
Add URL, and use DEPENS_PRIVATE generator params
2016-12-22 08:57:44 +01:00
Alex Nicksay
6ab0a5cee7 Python: Create Makefile for development shortcuts (#488) 2016-12-21 10:17:11 +01:00
Eugene Kliuchnikov
fd96151b2a Move brotlidump.py to research/ (#487) 2016-12-20 18:00:51 +01:00
Eugene Kliuchnikov
5814438791 Add configure-cmake (#474) (#486)
* Add configure-cmake

 * `curl https://raw.githubusercontent.com/nemequ/configure-cmake/7b0464af79bbaca535f0279316558e1d84e5c124/configure > configure-cmake`

* Add `--disable-shared-libs` parameter.

* Unix-friendly script prologue.

* Update README.md
2016-12-20 17:45:40 +01:00
jneb
f62cd2bcd2 brotlidump.py: disassemble brotli file (revisited) (#314)
* Create brotlidump.py

Sorry, I am a newbie. I couldn't find my file anymore when I wanted to edit it. Hope I don't waste your time.

* Fixed a bug where it couldn't read its own compression.

The problem was that a prefix code ending with a 16 "repeat" didn't realize the table was full already.
Also minor bug fixes, comments and stuff.

* Major refactoring

Rewrote almost everything.
Now can dump its own compression.

* Now more or less complete

Appears to handle all files completely (including metablock data).
Used as inspiration for the the hex example (see makehexexample.py)
2016-12-20 14:41:47 +01:00
Alex Nicksay
89a5b6e625 Python: Simplify test suite generation by using unittest discovery (#485) 2016-12-20 14:40:47 +01:00
Alex Nicksay
6f227228ce Python: Use a temporary directory for generated files in tests (#481) 2016-12-12 10:28:44 +01:00
Alex Nicksay
4651f7c000 Python: Format bro.py with yapf (#480) 2016-12-12 10:28:15 +01:00
Eugene Kliuchnikov
0ee416139f Update python brotli wrapper (#479)
* Update python brotli wrapper
 * release GIL on CPU intensive blocks, fixes #476
 * use BrotliDecoderTakeOutput (less memory, less memcpy)

* Python: Convert bro.py tests to unittest style (#478)

* Create unittest-style tests for `bro.py` decompression and compression
* Delete old tests for `bro.py`
* Update test method generation to properly create a Cartesian product
  of iterables using `itertools.product`

* Update python brotli wrapper
 * release GIL on CPU intensive blocks, fixes #476
 * use BrotliDecoderTakeOutput (less memory, less memcpy)
2016-12-12 10:27:13 +01:00
Alex Nicksay
4a60128c13 Python: Convert bro.py tests to unittest style (#478)
* Create unittest-style tests for `bro.py` decompression and compression
* Delete old tests for `bro.py`
* Update test method generation to properly create a Cartesian product
  of iterables using `itertools.product`
2016-12-09 13:44:05 +01:00
Frank Denis
50bc3a7145 Do not assume that bash is installed in /bin (#477)
This is required in order to run the tests on *BSD.
2016-12-09 08:58:13 +01:00
Eugene Kliuchnikov
ccabf811ff Added fuzzer and updated decoder (#475)
* log dictionary usage
 * remove dead assignment
 * added fuzzer for https://github.com/google/oss-fuzz
 * added standalone test for fuzzer
2016-12-08 12:55:18 +01:00
Eugene Kliuchnikov
222564a95d Fix encoder (#472)
* fix undefined behavior introduced with PR #468
2016-12-02 13:32:20 +01:00
Piotr Sikora
6a4bf43968 Fix build with -Wconditional-uninitialized. (#471)
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
2016-12-02 09:52:54 +01:00
Eugene Kliuchnikov
396309a529 Update (#470)
* condense generated `static_dict_lut.h`
 * implement BrotliInputStream.close()
2016-11-30 13:36:20 +01:00
Eugene Kliuchnikov
5db62dcc9d Fixes: (#468)
* fix slow-down after a long copy (q10-11)
 * more thorough hashing for long ranges (q10-11)
 * minor documentation fixes
 * bazel.io -> bazel.build
2016-11-09 14:04:09 +01:00
Alex Nicksay
1e5ea6aedd Python: Add unit tests for brotli.compress and brotli.decompress (#467)
Also
  - rename `test_utils` to `_test_utils`
  - refactor shared code into `_test_utils`
2016-11-09 12:21:13 +01:00
Evan Nemerson
12750768c2 bro: check return values of chown and chmod (#465)
Apparently some libc versions declare chown with the warn_unused_result
attribute, which is enabled by default.
2016-11-02 14:03:06 +01:00
Evan Nemerson
6c47009892 FInishing touches for installing libbrotli with CMake (#464)
* build: fix bundled mode + BUILD_SHARED_LIBS

* cmake: add soversion information

* cmake: generate pkg-config file
2016-11-01 10:03:29 +01:00
Eugene Kliuchnikov
e9b278ac6e Update docs and add more java tests (#463)
* doxygenize and update API documentation
 * fix spelling
 * add "fuzz" corpus for java decoder to improve coverage
 * use upper-case-snake names for dictionary constant definitions
 * use `LDFLAGS` in conventional `Makefile`
2016-10-31 14:33:59 +01:00
Alex Nicksay
a260b6ba73 Python: Add tests for streamed compression (#458)
Progress on #191
2016-10-31 13:24:01 +01:00
Alex Nicksay
9203765492 Python: Use "build" instead of "build_ext" in scripts (#460)
Previously, the Python package consisted of a single extension
module, so `build_ext` was sufficient.  Now, the package
contains a native module and an extension module, so both
`build_py` and `build_ext` are required.  Instead, run `build`,
which calls both `build_py` and `build_ext` automatically.
2016-10-31 12:58:45 +01:00
Alex Nicksay
1a8ee40de9 Python: Run Appveyor tests in CMD mode (#461)
Any command executed in PowerShell mode that writes to `stderr`
is treated as failing.  To avoid this problem, run tests in CMD
mode instead.
2016-10-27 17:25:05 +02:00
Mo DeJong
3b9d4a227d enable rbit instruction for arm64 (#459) 2016-10-27 17:21:12 +02:00
Eugene Kliuchnikov
4e157c409a Update API (#457)
* explicitly define `BROTLI_BOOL` to be `int`
 * add `BROTLI_` prefix to `MAKE_UINT64_T` macros
 * replace `true`/`false`/`1`/`0` mentions with `BROTLI_TRUE`/`FALSE`
 * add `BrotliEncoderSetParameter` documentation
 * add explicit caution to `BrotliEncoderMaxCompressedSize`
 * fix formatting in `port.h`
2016-10-25 16:02:05 +02:00
Alex Nicksay
afb1272792 Python: Publicly expose the Compressor object in the Python API (#456)
Progress on #191
2016-10-25 10:19:29 +02:00
Alex Nicksay
5632315d35 Python: Support streamed compression with the Compressor object (#448)
This adds `flush` and `finish` methods to the `Compressor`
object in the extension module, renames the `compress` method to
`process`, and updates that method to only process data.  Now,
one or more `process` calls followed by a `finish` call will be
equivalent to a module-level `compress` call.

Note: To maximize the compression efficiency (and match
underlying Brotli behavior, the `Compressor` object `process`
method does not guarantee all input is immediately written to
output. To ensure immediate output, call `flush` to manually
flush the compression buffer.  Extraneous flushing can increase
the size, but may be required when processing streaming data.

Progress on #191
2016-10-24 13:28:56 +02:00
Eugene Kliuchnikov
678f8627d3 Fix OSX gcc-4.x compilation (#455)
Fix OSX gcc-4.x compilation
2016-10-20 14:16:00 +02:00
Eugene Kliuchnikov
b1db6f149a Fix -Wcast-align warnings 2016-10-20 10:28:44 +02:00
Eugene Kliuchnikov
74147a1a41 Merge pull request #454 from fred-wang/brotli-readme
Fix build instructions for cmake
2016-10-19 22:27:45 +02:00
Frédéric Wang
82c297f356 Fix build instructions for cmake 2016-10-19 21:44:42 +02:00
Eugene Kliuchnikov
058a113dd9 Merge pull request #451 from eustas/flush
Fix "take output" flush workflow.
2016-10-19 20:03:43 +02:00
Eugene Kliuchnikov
8bcaabb0d1 Fix "take output" flush workflow. 2016-10-19 16:19:26 +02:00
Eugene Kliuchnikov
1b364aeb42 Merge pull request #450 from eustas/master
Build shared libraries by default
2016-10-18 20:52:04 +02:00
Eugene Kliuchnikov
b93cb69831 * leave static compilation declaration intouch (e.g. Python build) 2016-10-18 17:14:49 +02:00
Eugene Kliuchnikov
f5ba0b6c17 (compress_fragment_two_pass) 2016-10-18 16:56:39 +02:00
Eugene Kliuchnikov
69982c25f1 Build shared libraries by default
* Declare `BUILD_SHARED_LIBS` option for CMake
* Define `${LIB}_SHARED_COMPILATION` when compiling shared library
* Define and use BROTLI_xxx_API
* Fix remaining unprefixed defines in port.h
2016-10-18 16:45:32 +02:00
Eugene Kliuchnikov
0781cb10ab Merge pull request #449 from eustas/master
Fix POM files sources paths
2016-10-18 15:29:39 +02:00
Eugene Kliuchnikov
d18c7369d9 Fix POM files sources paths
* also add javadocs and sources generation
2016-10-18 15:28:43 +02:00
Eugene Kliuchnikov
2d441179bb Merge pull request #446 from nicksay/py-3-compressor-object
Python: Create an extension Compressor object
2016-10-18 10:31:05 +02:00
Eugene Kliuchnikov
606a70b779 Merge pull request #447 from nicksay/py-yapf
Python: Update README with information about code formatting
2016-10-18 10:28:28 +02:00
Eugene Kliuchnikov
81962c3892 Merge pull request #444 from eustas/master
Eliminate more magic constants.
2016-10-18 10:27:22 +02:00
Alex Nicksay
b04f4ea185 Python: Update README with information about code formatting
Also, add a `yapf` section to `setup.cfg` to ensure YAPF runs
format code with the Google style.
2016-10-17 13:57:56 -04:00
Alex Nicksay
595a5246b4 Python: Create an extension Compressor object
- Create a `Compressor` object in the extension module
- Move the `compress` method into the native module and use
  the new `Compressor` object to do the compression

Note: This does not change the module-level Python API.  The
`Compressor` object will not be publicly exposed until its
methods have stabilized.
2016-10-17 13:03:58 -04:00
Eugene Kliuchnikov
d60aa23116 Merge pull request #443 from nicksay/py-2-package-structure
Python: Create native brotli module and move extension to _brotli
2016-10-17 17:40:32 +02:00
Eugene Kliuchnikov
9521d968f3 Eliminate more magic constants.
Author: Ivan Nikulin
2016-10-17 17:33:12 +02:00
Eugene Kliuchnikov
4219fece59 Merge pull request #424 from mdejong/master
check for __ARM64_ARCH_8__ in dec/port.h so that arm64 arch under cla…
2016-10-17 15:44:39 +02:00
Alex Nicksay
f7b5b3dc2c Python: Create native brotli module and move extension to _brotli 2016-10-17 09:35:27 -04:00
Eugene Kliuchnikov
541dd651e0 Merge pull request #435 from nicksay/py-1-cleanup-setup
Python: Clean up setup.py file
2016-10-17 14:31:42 +02:00
Eugene Kliuchnikov
d767ab9eb0 Merge pull request #439 from fred-wang/remove-underscore
Remove the underscore in the name of brotli libraries. #326
2016-10-17 14:29:38 +02:00
Eugene Kliuchnikov
bc658c2501 Merge pull request #440 from fred-wang/cmake
CMake: Also add ARCHIVE DESTINATION for non-WIN32
2016-10-17 14:20:40 +02:00
Eugene Kliuchnikov
54dc9b0cf1 Merge pull request #441 from fred-wang/readme
Add some basic build instructions in the README.md #166
2016-10-17 14:19:33 +02:00
Eugene Kliuchnikov
616ed51e6e Merge pull request #442 from eustas/master
Add Java port of Brotli decoder.
2016-10-17 14:17:57 +02:00
Eugene Kliuchnikov
5025365d0f Add Java port of Brotli decoder. 2016-10-17 14:04:59 +02:00
Frédéric Wang
8db7411bbc Add some basic build instructions in the README.md #166 2016-10-12 22:00:35 +02:00
Frédéric Wang
1c7776605d CMake: Also add ARCHIVE DESTINATION for non-WIN32 2016-10-12 21:20:04 +02:00
Frédéric Wang
ed2748abbf Remove the underscore in the name of brotli libraries. #326 2016-10-12 18:24:04 +02:00
Alex Nicksay
6f55ee6097 Python: Clean up setup.py file 2016-10-12 11:43:14 -04:00
Eugene Kliuchnikov
85817beba8 Merge pull request #437 from fred-wang/cmake-include
Also install the brotli headers when building the shared libraries. #326
2016-10-12 18:40:37 +03:00
Frédéric Wang
9389876ee9 Add ARCHIVE destination for Windows. 2016-10-12 16:59:11 +02:00
Frédéric Wang
c41962f0ae Use install directories provided by GNUInstallDirs. 2016-10-12 16:12:13 +02:00
Eugene Kliuchnikov
6244e69062 Merge pull request #438 from google/eustas-fix-osx
Use system version of compiler with macpython
2016-10-12 16:31:19 +03:00
Eugene Kliuchnikov
db4cfc1219 Use system version of compiler with macpython 2016-10-12 15:08:41 +02:00
Frédéric Wang
82536d2bae Also install the libraries and headers when building static libraries. 2016-10-12 14:53:37 +02:00
Frédéric Wang
cd8153a1ed Do not install the public headers on WIN32. #326 2016-10-12 14:33:19 +02:00
Frédéric Wang
93933405d3 Actually use BROTLI_INCLUDE_DIRS to get the path to headers. 2016-10-07 22:52:02 +02:00
Frédéric Wang
89a77a94bb Also install the brotli headers when building the shared libraries. #326 2016-10-07 22:26:26 +02:00
Eugene Kliuchnikov
a9f2344f41 Merge pull request #434 from eustas/master
Update research
2016-09-22 12:38:33 +02:00
Eugene Kliuchnikov
dd8fa3e8dd Update research
* don't use `assert` when side-effect is desired
 * use `gflags` to pick options from args

Other changes:
 * teach stub `Makefile` to do partial rebuild
 * remove obsolete `tools/version.h`
2016-09-22 11:32:23 +02:00
Eugene Kliuchnikov
25444e8858 Merge pull request #433 from eustas/master
Update encoder
2016-09-21 19:25:04 +02:00
Eugene Kliuchnikov
0a63f99db9 Update encoder
* move `common/port.h` to `includes/port.h`
 * replace magic more magic numbers with constants
 * artificially limit window size to 2^18 for quality 0 and 1
 * use fixed shifts for quality 0 and 1 hashes
 * removed `BrotliEncoderWriteMetadata`
 * added `BROTLI_OPERATION_EMIT_METADATA` instead
 * deprecated low-level API
 * fixed MSVC warnings
2016-09-21 17:20:36 +02:00
Eugene Kliuchnikov
97fb2090c7 Merge pull request #431 from eustas/master
Update decoder
2016-09-21 16:16:15 +02:00
Eugene Kliuchnikov
86fdb68373 Update brotlimodule.cc 2016-09-21 16:02:32 +02:00
Eugene Kliuchnikov
7cbdb4aa0c Update brotlimodule.cc 2016-09-21 15:51:54 +02:00
Eugene Kliuchnikov
de1007f05a Fix uid/gid types 2016-09-21 15:44:56 +02:00
Eugene Kliuchnikov
b754f607aa Update python module
* use new decoder API
2016-09-21 15:37:45 +02:00
Eugene Kliuchnikov
9223fd4d8d Update bro tool:
* use new decoder API
 * copy permissions and modification time to output file
2016-09-21 15:18:35 +02:00
Eugene Kliuchnikov
f20b3eeb2f Update decoder:
* use BROTLI_MAX_DISTANCE_BITS instead of magic constant
 * introduce BROTLI_DEPRECATED
 * move BROTLI_RESTRICT to common/port.h
 * promote reg_t to dec/port.h
 * remove deprecated decoder API
 * more optimistic ring-buffer allocation
 * fix MSVC warnings
 * fix (not tested) for ARM 64-bit RBIT
2016-09-21 15:05:12 +02:00
Eugene Kliuchnikov
20e36ef2f7 Merge pull request #427 from Zip753/master
Add backward reference research tools to brotli repository.
2016-09-20 10:32:34 +02:00
Eugene Kliuchnikov
56c890d5f9 Merge pull request #429 from google/fix-homebrew-gcc-6
Fix integration build
2016-09-20 10:30:42 +02:00
Eugene Kliuchnikov
ea9c51e4ca Update .travis.yml 2016-09-20 10:11:46 +02:00
Eugene Kliuchnikov
887f6fd48f Fix integration build
Homebrew started getting stuck with gcc-6
Temporarily switch to gcc-5
2016-09-20 10:10:21 +02:00
Ivan Nikulin
9294022929 Replace sais.hxx by submodule hillbig/esaxx. 2016-09-19 19:12:30 +02:00
Ivan Nikulin
4291932022 Update research tools description. 2016-09-15 17:19:26 +02:00
Ivan Nikulin
0e52c59a07 Update variable naming. 2016-09-15 16:59:52 +02:00
Ivan Nikulin
9589396e5d Add description of research tools. 2016-09-15 11:34:19 +02:00
Ivan Nikulin
58cecf1783 Add distance encoding research tools. 2016-09-15 10:44:19 +02:00
Mo DeJong
214629ccd7 check for __ARM64_ARCH_8__ in dec/port.h so that arm64 arch under clang is detected, check for __ARM_ARCH being exactly equal to 7 so that arm64 arch does not define BROTLI_TARGET_ARMV7 2016-08-30 12:35:40 -07:00
Eugene Kliuchnikov
5ce9bf11b3 Merge pull request #423 from PiotrSikora/libm
Bazel: link ":brotli_enc" with -lm.
2016-08-29 13:09:09 +02:00
Piotr Sikora
b4f8c7814a Bazel: link ":brotli_enc" with -lm.
While this isn't strictly necessary with recent versions of Bazel
(which unconditionally add -lm to linkopts), building Brotli with
older versions of Bazel requires -lm to be added explicitly.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
2016-08-29 02:32:12 -07:00
Eugene Kliuchnikov
d7c4da8ad0 Merge pull request #421 from fred-wang/cmake-shared-libraries
Add support for CMake's BUILD_SHARED_LIBS option. #326
2016-08-27 17:45:57 +02:00
Eugene Kliuchnikov
55dd5d7824 Merge pull request #422 from fred-wang/cmake-typo
Fix typo in CMakeFile: s/BROTLI_BUNDLE_MODE/BROTLI_BUNDLED_MODE/
2016-08-27 17:42:08 +02:00
Frédéric Wang
5e1219a32d Fix typo in CMakeFile: s/BROTLI_BUNDLE_MODE/BROTLI_BUNDLED_MODE/ 2016-08-27 12:14:50 +02:00
Frédéric Wang
074e4acd8a Add support for CMake's BUILD_SHARED_LIBS option. #326 2016-08-27 12:05:51 +02:00
Eugene Kliuchnikov
e7f47b9470 Merge pull request #418 from PiotrSikora/bazel_cc_library
Bazel: use cc_library instead of cc_inc_library.
2016-08-24 11:54:07 +02:00
Eugene Kliuchnikov
f14b38d84f Merge pull request #420 from PiotrSikora/bazel_license
Bazel: export LICENSE file.
2016-08-24 11:07:50 +02:00
Piotr Sikora
d0391c99ee Bazel: export LICENSE file.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
2016-08-23 19:58:51 -07:00
Piotr Sikora
2cc33230f4 Bazel: use cc_library instead of cc_inc_library.
cc_inc_library is broken when used with external repositories
(see: https://github.com/bazelbuild/bazel/issues/1596), which
makes it a bit useless at the moment.

Switch to using cc_library with "includes" attribute to expose
public headers.

While there, fix order of attributes in ":brotli_common" target.

Signed-off-by: Piotr Sikora <piotrsikora@google.com>
2016-08-23 15:43:07 -07:00
Eugene Kliuchnikov
85cc650a11 Merge pull request #417 from eustas/master
Move "public" -> "include/brotli"
2016-08-23 16:21:58 +02:00
Eugene Kliuchnikov
532921b8c8 Fix CMake includes. 2016-08-23 15:35:54 +02:00
Eugene Kliuchnikov
3dcc3c02d0 Fix CMake includedirs 2016-08-23 14:49:37 +02:00
Eugene Kliuchnikov
6e2207f486 Add includedir to setup.py 2016-08-23 14:44:13 +02:00
Eugene Kliuchnikov
8148001158 Move "public" to "include/brotli" 2016-08-23 14:40:33 +02:00
Eugene Kliuchnikov
2032f41ffa Merge pull request #416 from sylvestre/master
Remove some deadcode
2016-08-23 10:32:05 +02:00
Sylvestre Ledru
5ad715e445 Remove some deadcode 2016-08-22 18:22:24 +02:00
Eugene Kliuchnikov
7ecd22939c Merge pull request #415 from eustas/master
Use version from common/version.h
2016-08-22 16:29:13 +02:00
Eugene Kliuchnikov
ceddddeb2b Fix setup.py regexp 2016-08-22 15:59:08 +02:00
Eugene Kliuchnikov
2c2d5578a6 Use version from common/version.h 2016-08-22 15:44:12 +02:00
Eugene Kliuchnikov
313066a037 Merge pull request #414 from eustas/master
Update brotli to ToT
2016-08-22 14:48:09 +02:00
Eugene Kliuchnikov
dae9c7ffd0 Fix brotlimodule 2016-08-22 14:00:42 +02:00
Eugene Kliuchnikov
403729c454 Fix setup.py 2016-08-22 13:59:21 +02:00
Eugene Kliuchnikov
801f5f37ee * rename macros with preceding underscore
* add Brotli*TakeOutput methods
* * flushing now doesn't require additional call
* add Brotli*Version methods
* moved public headers to 'public' directory
* removed C++ API
* do not assume STDC_VERSION is defined
2016-08-22 13:28:22 +02:00
Eugene Kliuchnikov
2e0d3214c2 Merge pull request #407 from anthrotype/fix-python-build
Fix python build
2016-08-11 10:59:52 +02:00
Cosimo Lupo
11dc16bf91 .travis.yml: fix wheel deploy 'condition' (use '&&', not '||') 2016-08-10 19:29:48 +01:00
Cosimo Lupo
52cb947469 .travis.sh: skip source venv on linux 2016-08-10 19:27:17 +01:00
Cosimo Lupo
13b8e7ff60 .travis.yml: only deploy OSX wheels 2016-08-10 19:22:41 +01:00
Cosimo Lupo
18dfca9bc1 .travis.sh: only build wheels on OSX 2016-08-10 18:43:41 +01:00
Cosimo Lupo
2dc2ac7701 .travis.sh: explicitly use 'bash' shell instead of generic 'sh'
I need to use 'source' command to activate the python virtual environment,
but apparently that is not available on Ubuntu's default /bin/sh
2016-08-10 18:33:43 +01:00
Cosimo Lupo
e944f1c92b .travis.yml: update Python versions
Removed unnecessary builds for homebrew and system python.
We only use the official Mac Python distributions from Python.org.
The wheels compiled with those will work on both homebrew Python and
the OSX built-in Python.
2016-08-10 18:12:36 +01:00
Cosimo Lupo
a5bf2c0bdd .travis.sh: activate virtualenv before calling python or pip commands 2016-08-10 18:05:38 +01:00
Eugene Kliuchnikov
014f651114 Add benchmarks section 2016-08-10 14:44:59 +02:00
Eugene Kliuchnikov
5d8b0f3a0c Merge pull request #400 from nemequ/master
appveyor: add Visual Studio builds (based on CMake)
2016-08-10 13:30:59 +02:00
Evan Nemerson
fe0e153cb8 cmake: use a different variable for testing with and without libm
CMake seems to cache the result when using the same variable, at least
with some versions, so previously systems requiring libm for log2 may
not have worked as expected.
2016-08-04 18:51:20 -07:00
szabadka
e0a87638d8 Merge pull request #404 from szabadka/master
Update the spec reference to RFC 7932, remove the old internet draft.
2016-08-02 13:46:11 +02:00
Zoltan Szabadka
af1768478a Update the spec reference to RFC 7932, remove the old internet draft. 2016-08-02 13:27:29 +02:00
Evan Nemerson
c1ec7ba292 appveyor: add Visual Studio builds (based on CMake)
This only goes back to VS 12 (2013) because MSVC didn't support log2
until then.
2016-07-29 13:24:17 -07:00
Eugene Kliuchnikov
24685f9bf3 Merge pull request #399 from nemequ/master
Add mingw support.
2016-07-29 10:38:09 +02:00
Evan Nemerson
01f9cf94a0 travis: use container-based infrastructure 2016-07-28 20:13:29 -07:00
Evan Nemerson
03657e8089 Add mingw support. 2016-07-28 19:32:33 -07:00
Eugene Kliuchnikov
bd3a5ada1a Merge pull request #397 from nemequ/master
Add CMake, lots of Travis configurations
2016-07-28 10:18:20 +02:00
Eugene Kliuchnikov
882f41850b Merge pull request #396 from eustas/master
Fix ubsan warning.
2016-07-27 11:12:23 +02:00
Eugene Kliuchnikov
e0af054d9e Fix ubsan warning. 2016-07-27 11:02:27 +02:00
Evan Nemerson
97b846f2fe travis: disable gcc-4.5 build
Travis seems to have trouble with the PPA.
2016-07-26 20:59:28 -07:00
Evan Nemerson
26a59359ed Add UBSan build, use clang for sanitizer builds. 2016-07-26 17:31:26 -07:00
Evan Nemerson
45862fcb06 Enable address sanitizer build. 2016-07-26 08:53:26 -07:00
Evan Nemerson
1d15c95363 travis: enable clang builds now that the LLVM repos are back up 2016-07-26 08:53:26 -07:00
Evan Nemerson
37be4e37ce travis: add many additional builds 2016-07-26 08:53:26 -07:00
Evan Nemerson
3e33d7636a travis: add CMake builds 2016-07-26 08:53:26 -07:00
Evan Nemerson
93ef13f823 Add CMake build system. 2016-07-26 08:53:26 -07:00
Eugene Kliuchnikov
8b7e3c7ba5 Merge pull request #218 from gtalusan/master
allow output file to be overwritten if --repeat
2016-07-26 16:41:34 +02:00
George Talusan
a531496f23 allow output file to be overwritten if --repeat 2016-07-26 10:08:54 -04:00
Eugene Kliuchnikov
a5a38bd7a1 Merge pull request #390 from chad-iris/master
Add a "lib" target to the Makefile to build a static library: libbrotli.a
2016-07-26 16:00:18 +02:00
Eugene Kliuchnikov
de87f114ba Merge pull request #395 from eustas/master
Update encoder
2016-07-26 15:56:30 +02:00
Eugene Kliuchnikov
058cd1ac7a cleanup 2016-07-26 14:51:51 +02:00
Eugene Kliuchnikov
2048189048 Update encoder:
* booleanification
 * integer BR scores, may improve performance if FPU is slow
 * condense speed-quality constants in quality.h
 * code massage to calm down CoverityScan
 * hashers refactoring
 * new hasher - improved speed, compression and reduced memory usage for q:5-9 w:10-16
 * reduced static recources -> binary size
2016-07-26 14:41:59 +02:00
Eugene Kliuchnikov
aa96d64ce1 Merge pull request #394 from eustas/master
Update decoder API
2016-07-26 10:12:23 +02:00
Eugene Kliuchnikov
91cbcf9ed1 Fix MSVC stdbool inclde 2016-07-25 10:56:23 +02:00
Eugene Kliuchnikov
0ef4edacab Do not use "static inline" for deprecated API. 2016-07-25 10:36:36 +02:00
Eugene Kliuchnikov
d2b17196f2 Fix DecodeContextMap 2016-07-25 10:22:33 +02:00
Eugene Kliuchnikov
43d4f45b6e Update decoder API:
* replace prefix Brotli -> BrotliDecoder
* add HasMoreOutput
* make instance pointer the first argument
* use boolean instead of int
2016-07-25 10:17:04 +02:00
chad-iris
8529bfa1f7 Add a "lib" target to the Makefile to build a static library named libbrotli.a 2016-07-06 10:09:04 -07:00
Eugene Kliuchnikov
7acb34f7cc Merge pull request #386 from eustas/master
Update build systems
2016-06-24 16:22:23 +02:00
Evgenii Kliuchnikov
52ff81717b Update build systems 2016-06-24 15:32:51 +02:00
eustas
49df97c472 Merge pull request #385 from google/eustas-patch-1
Fix issue #383
2016-06-23 22:04:55 +02:00
eustas
27f9d00efc Fix issue #383 2016-06-23 11:02:53 +02:00
eustas
2c16351987 Update readme
Added "Related projects" section
2016-06-21 13:41:25 +02:00
eustas
bdf54ed9e6 Merge pull request #382 from eustas/master
Update encoder and add xcode projects
2016-06-20 17:50:57 +02:00
Eugene Kliuchnikov
6a078b17d7 Update encoder
* reorganize premake artifacts
* remove deprecated methods/struct
2016-06-20 17:20:17 +02:00
eustas
f836c9459f Merge pull request #381 from eustas/master
Fix wheels build
2016-06-17 21:52:40 +02:00
eustas
902b20c0b8 Remove command line args
Values should be taken from setup.cfg
2016-06-17 20:25:28 +02:00
eustas
81054a4667 Restore platform suffix 2016-06-17 20:24:24 +02:00
Eugene Kliuchnikov
1bcdb45bad Fix wheels build. 2016-06-17 20:10:31 +02:00
eustas
3116718689 Merge pull request #379 from eustas/master
Fix VS build problems
2016-06-17 16:39:00 +02:00
Eugene Kliuchnikov
b32cefe160 Fix VS build problems:
* rename build -> buildfiles to avoid clashing with BUILD
* set binary mode for stdin/out in bro
* convert bro to C
2016-06-17 16:24:51 +02:00
eustas
bce5c9b78c Merge pull request #378 from eustas/master
Fix travis build
2016-06-16 17:59:04 +02:00
eustas
f0c7ece315 Remove platform suffix 2016-06-16 17:45:05 +02:00
eustas
bac060f1a1 Fix library path 2016-06-16 17:25:26 +02:00
eustas
cac71ca2f6 Put library also to bin 2016-06-16 17:23:21 +02:00
eustas
41e4fb8bf3 Reroute -t parameter to build_ext 2016-06-16 15:36:11 +02:00
eustas
198b51d969 Place binary output to bin/ 2016-06-16 15:31:09 +02:00
eustas
7152be88ec Remove deleted files 2016-06-16 15:30:18 +02:00
eustas
b4d12191fd Merge pull request #377 from eustas/master
Update build system. Now libraries are produced as build artifacts.
2016-06-16 14:34:41 +02:00
Eugene Kliuchnikov
f9ab24a7aa Fix gitignore 2016-06-16 11:06:47 +02:00
Eugene Kliuchnikov
378485b097 Update build system. Now libraries are produced as build artifacts.
There are currently 3 ways to build:
* Easy: `./configure; make`
* Simple: use Bazel
* Portable: use premake5 to generate XCode / MSVS projects
2016-06-16 10:52:57 +02:00
eustas
4a8bb6a2b8 Merge pull request #376 from PiotrSikora/cpp_constants
Restore C++ constants in "brotli" namespace.
2016-06-16 10:08:38 +02:00
Piotr Sikora
629d01cc7e Restore C++ constants in "brotli" namespace.
Signed-off-by: Piotr Sikora <piotrsikora@google.com>
2016-06-15 14:25:46 -07:00
eustas
e4c891420c Merge pull request #375 from google/v0.5
Update to v0.5
2016-06-15 12:04:04 +02:00
eustas
cd33d2757c Merge pull request #373 from eustas/v0.5
Rebase
2016-06-14 16:06:45 +02:00
eustas
95151f9525 Update version to 0.5.0 2016-06-14 16:05:49 +02:00
Eugene Kliuchnikov
ee6bb31599 Merge branch 'v0.5' of https://github.com/eustas/brotli into v0.5 2016-06-14 16:03:51 +02:00
eustas
e208eff300 Fix: declare variables before code 2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
986aa099ac Fix VS compilation warnings; cleanup API. 2016-06-14 16:02:38 +02:00
eustas
048b14c65f Fix declaration / instruction mixture 2016-06-14 16:02:38 +02:00
eustas
999a3ce9b2 Fix implicit 32->64 bit conversion 2016-06-14 16:02:38 +02:00
eustas
514941adeb Fix implicit 32->64 bit conversion 2016-06-14 16:02:38 +02:00
eustas
7d928f470d Fix implicit 32->64 bit conversion 2016-06-14 16:02:38 +02:00
eustas
3bb19efbb8 Fix double constant literal 2016-06-14 16:02:38 +02:00
eustas
d7d59462db Backport MSVC log2 fix 2016-06-14 16:02:38 +02:00
eustas
918ddd31c1 Fix kInfinity definition
INFINITY is legal only in C++11
2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
8872d7b459 Fix CI build. 2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
3ccbf05d5f Convert encoder to plain C. 2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
c2a9b2d276 Step 3: change file extension C++ -> C
This will break the build.
2016-06-14 16:02:38 +02:00
eustas
55bd78fbca Fix test file path 2016-06-14 16:02:38 +02:00
eustas
ab53fc31f7 Fix test file path 2016-06-14 16:02:38 +02:00
eustas
98bc7fcc13 Fix test file path 2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
2a6b041a4e Update setup.py 2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
58a3023ef8 Transform most of C++ comments to C-style. 2016-06-14 16:02:38 +02:00
Eugene Kliuchnikov
f1c9ab2910 Extract common parts: constants, dictionary, etc. 2016-06-14 16:02:38 +02:00
eustas
34bfe2a6cc Add Travis status 2016-06-14 15:53:06 +02:00
eustas
22f7e3c1f6 Merge pull request #372 from eustas/to-v0.4
Fix VS compilation warnings; cleanup API.
2016-06-14 14:44:01 +02:00
eustas
e96b7db0c2 Fix: declare variables before code 2016-06-14 13:54:11 +02:00
Eugene Kliuchnikov
be1a53a61b Fix VS compilation warnings; cleanup API. 2016-06-14 13:42:47 +02:00
eustas
2b4bdc4bcd Merge pull request #371 from eustas/to-v0.4
Fix CI build.
2016-06-13 17:18:13 +02:00
eustas
48da49b097 Fix declaration / instruction mixture 2016-06-13 16:46:25 +02:00
eustas
40101bb9bf Fix implicit 32->64 bit conversion 2016-06-13 16:44:52 +02:00
eustas
85712c66d4 Fix implicit 32->64 bit conversion 2016-06-13 16:30:17 +02:00
eustas
a08c36bd86 Fix implicit 32->64 bit conversion 2016-06-13 16:28:19 +02:00
eustas
bc7f0f831d Fix double constant literal 2016-06-13 15:54:57 +02:00
eustas
8d14bed24a Backport MSVC log2 fix 2016-06-13 15:50:15 +02:00
eustas
18dc8eabe1 Fix kInfinity definition
INFINITY is legal only in C++11
2016-06-13 15:40:45 +02:00
Eugene Kliuchnikov
db3a11625d Fix CI build. 2016-06-13 15:22:13 +02:00
eustas
0747bda5e0 Merge pull request #370 from eustas/to-v0.4
Convert encoder to plain C.
2016-06-13 11:53:23 +02:00
Eugene Kliuchnikov
b972c67780 Convert encoder to plain C. 2016-06-13 11:01:04 +02:00
eustas
63111b21e8 Merge pull request #367 from google/master
Merge upstream changes
2016-06-06 12:00:07 +02:00
eustas
ae80e76c8d Merge pull request #364 from eustas/to-v0.4
Step 3: change file extension C++ -> C
2016-06-03 16:31:07 +02:00
Eugene Kliuchnikov
582ecab380 Step 3: change file extension C++ -> C
This will break the build.
2016-06-03 16:29:37 +02:00
eustas
2f87a5ae2d Merge pull request #363 from eustas/to-v0.4
Update setup.py
2016-06-03 12:41:12 +02:00
eustas
b73ebe32b4 Fix test file path 2016-06-03 12:31:02 +02:00
eustas
66606e7d43 Fix test file path 2016-06-03 12:30:45 +02:00
eustas
9dd7e38bde Fix test file path 2016-06-03 12:30:12 +02:00
Eugene Kliuchnikov
11d1337baf Update setup.py 2016-06-03 12:02:01 +02:00
eustas
5a206dd9ab Merge pull request #362 from google/master
Pick-up pyhton build script update
2016-06-03 11:36:21 +02:00
eustas
6e356105b5 Merge pull request #361 from eustas/to-v0.4
Step 2: update comments
2016-06-03 11:32:32 +02:00
Eugene Kliuchnikov
352b0b2836 Transform most of C++ comments to C-style. 2016-06-03 11:19:23 +02:00
eustas
7cdcbd72c3 Merge pull request #360 from eustas/master
Step 1: extract common
2016-06-03 10:58:24 +02:00
Eugene Kliuchnikov
028291865d Extract common parts: constants, dictionary, etc. 2016-06-03 10:51:04 +02:00
416 changed files with 89623 additions and 65201 deletions

6
.bazelignore Normal file
View File

@@ -0,0 +1,6 @@
# Exclude Bazel roots (workspaces)
c/fuzz
go
java
js
research

40
.editorconfig Normal file
View File

@@ -0,0 +1,40 @@
# http://editorconfig.org
# Consistent coding style across different editors.
# Top-most file
root = true
# Global styles:
# - indent 2 spaces
# - add final new line
# - trim trailing whitespace
[*]
charset = utf-8
end_of_line = lf
indent_size = 2
indent_style = space
insert_final_newline = true
trim_trailing_whitespace = true
# BUILD:
# - indent 4 spaces
[BUILD]
indent_size = 4
# Makefile:
# - indent 1 tab
[Makefile]
indent_size = tab
indent_style = tab
# Markdown:
# - indent 4 spaces
# - trailing whitespace is significant
[*.md]
indent_size = 4
trim_trailing_whitespace = false
# Python
# - indent 4 spaces
[*.py]
indent_size = 4

54
.gitattributes vendored Normal file
View File

@@ -0,0 +1,54 @@
tests/testdata/* binary
# Exclude everything
**/** export-ignore
# Add top-level files
.bazelignore !export-ignore
BUILD.bazel !export-ignore
CHANGELOG.md !export-ignore
CMakeLists.txt !export-ignore
CONTRIBUTING.md !export-ignore
LICENSE !export-ignore
MANIFEST.in !export-ignore
README !export-ignore
README.md !export-ignore
SECURITY.md !export-ignore
setup.cfg !export-ignore
setup.py !export-ignore
WORKSPACE.bazel !export-ignore
# Add sources
c !export-ignore
c/** !export-ignore
c/common/dictionary.bin* export-ignore
c/fuzz export-ignore
# Add man pages
docs !export-ignore
docs/** !export-ignore
docs/brotli-comparison-study-2015-09-22.pdf export-ignore
# Add python bindings + tests
python !export-ignore
python/** !export-ignore
# Add go bindings + tests
go !export-ignore
go/** !export-ignore
# Add more build files.
scripts !export-ignore
scripts/download_testdata.sh !export-ignore
scripts/libbrotli*.pc.in !export-ignore
# Add testdata
tests !export-ignore
tests/*.sh !export-ignore
tests/*.cmake !export-ignore
tests/testdata !export-ignore
tests/testdata/empty !export-ignore
tests/testdata/empty.compressed !export-ignore
tests/testdata/ukkonooa !export-ignore
tests/testdata/ukkonooa.compressed !export-ignore
tests/testdata/zerosukkanooa.compressed !export-ignore

11
.github/dependabot.yml vendored Normal file
View File

@@ -0,0 +1,11 @@
# Copyright 2023 Google Inc. All Rights Reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "weekly"

357
.github/workflows/build_test.yml vendored Normal file
View File

@@ -0,0 +1,357 @@
# Copyright 2021 Google Inc. All Rights Reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
# Workflow for building and running tests under Ubuntu
name: Build/Test
on:
push:
branches:
- master
pull_request:
types: [opened, reopened, labeled, unlabeled, synchronize]
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
build_test:
name: Build and test ${{ matrix.name }}
runs-on: ${{ matrix.os || 'ubuntu-latest' }}
defaults:
run:
shell: bash
strategy:
fail-fast: false
matrix:
include:
- name: cmake:gcc
build_system: cmake
c_compiler: gcc
cxx_compiler: g++
- name: cmake:gcc-old
build_system: cmake
c_compiler: gcc
cxx_compiler: g++
os: ubuntu-22.04
- name: cmake:clang
build_system: cmake
c_compiler: clang
cxx_compiler: clang
- name: cmake:clang-old
build_system: cmake
c_compiler: clang
cxx_compiler: clang
os: ubuntu-22.04
- name: cmake:package
build_system: cmake
cmake_args: -DBROTLI_BUILD_FOR_PACKAGE=ON
- name: cmake:static
build_system: cmake
cmake_args: -DBUILD_SHARED_LIBS=OFF
- name: cmake:clang:asan
build_system: cmake
sanitizer: address
c_compiler: clang
cxx_compiler: clang++
- name: cmake:clang:tsan
build_system: cmake
sanitizer: thread
c_compiler: clang
cxx_compiler: clang++
- name: cmake:clang:ubsan
build_system: cmake
sanitizer: undefined
c_compiler: clang
cxx_compiler: clang++
c_flags: -fno-sanitize-recover=undefined,integer
- name: cmake:qemu-arm-neon-gcc
build_system: cmake
c_compiler: arm-linux-gnueabihf-gcc
cxx_compiler: arm-linux-gnueabihf-g++
c_flags: -march=armv7-a -mfloat-abi=hard -mfpu=neon
extra_apt_pkgs: gcc-arm-linux-gnueabihf libc6-dev-armhf-cross qemu-user
- name: cmake-osx:clang
build_system: cmake
c_compiler: clang
cxx_compiler: clang++
os: macos-latest
- name: cmake-osx:gcc
build_system: cmake
c_compiler: gcc
cxx_compiler: g++
os: macos-latest
- name: cmake-win64:msvc-rel
build_system: cmake
cmake_generator: Visual Studio 17 2022
cmake_config: Release
os: windows-latest
- name: cmake-win64:msvc-dbg
build_system: cmake
cmake_generator: Visual Studio 17 2022
cmake_config: Debug
os: windows-latest
- name: fuzz:clang
build_system: fuzz
c_compiler: clang
cxx_compiler: clang++
- name: python3.10:clang
build_system: python
python_version: "3.10"
c_compiler: clang
cxx_compiler: clang++
- name: python3.10-win
build_system: python
python_version: "3.10"
# TODO: investigate why win-builds can't run tests
py_setuptools_cmd: build_ext
os: windows-2022
- name: maven
build_system: maven
- name: bazel:root
build_system: bazel
bazel_project: .
- name: bazel:go
build_system: bazel
bazel_project: go
- name: bazel:java
build_system: bazel
bazel_project: java
- name: bazel:research
build_system: bazel
bazel_project: research
- name: bazel-osx:root
build_system: bazel
bazel_project: .
os: macos-latest
- name: bazel-osx:go
build_system: bazel
bazel_project: go
os: macos-latest
- name: bazel-osx:java
build_system: bazel
bazel_project: java
os: macos-latest
- name: bazel-osx:research
build_system: bazel
bazel_project: research
os: macos-latest
- name: bazel-win:root
build_system: bazel
bazel_project: .
os: windows-latest
# TODO(eustas): restore when go is fixed
#- name: bazel-win:go
# build_system: bazel
# bazel_project: go
# os: windows-latest
# TODO(eustas): restore when kotlin is fixed
#- name: bazel-win:java
# build_system: bazel
# bazel_project: java
# os: windows-latest
- name: bazel-win:research
build_system: bazel
bazel_project: research
os: windows-latest
env:
CC: ${{ matrix.c_compiler || 'gcc' }}
CXX: ${{ matrix.cxx_compiler || 'gcc' }}
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Install extra deps @ Ubuntu
if: ${{ runner.os == 'Linux' }}
# Already installed: bazel, clang{13-15}, cmake, gcc{9.5-13.1}, java{8,11,17,21}, maven, python{3.10}
run: |
EXTRA_PACKAGES="${{ matrix.extra_apt_pkgs || '' }}"
sudo apt update
sudo apt install -y ${EXTRA_PACKAGES}
- name: Checkout the source
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: false
fetch-depth: 1
- name: Configure / Build / Test with CMake
if: ${{ matrix.build_system == 'cmake' }}
run: |
export ASAN_OPTIONS=detect_leaks=0
declare -a CMAKE_OPTIONS=(${{ matrix.cmake_args || '' }})
CMAKE_OPTIONS+=("-DCMAKE_VERBOSE_MAKEFILE=ON")
[ ! -z '${{ matrix.c_compiler || '' }}' ] && CMAKE_OPTIONS+=(-DCMAKE_C_COMPILER='${{ matrix.c_compiler }}')
[ ! -z '${{ matrix.cxx_compiler || '' }}' ] && CMAKE_OPTIONS+=(-DCMAKE_CXX_COMPILER='${{ matrix.cxx_compiler }}')
[ ! -z '${{ matrix.sanitizer || '' }}' ] && CMAKE_OPTIONS+=(-DENABLE_SANITIZER='${{ matrix.sanitizer }}')
[ ! -z '${{ matrix.cmake_generator || '' }}' ] && export CMAKE_GENERATOR='${{ matrix.cmake_generator }}'
declare -a CMAKE_BUILD_OPTIONS=()
[ ! -z '${{ matrix.cmake_config || '' }}' ] && CMAKE_BUILD_OPTIONS+=(--config '${{ matrix.cmake_config }}')
declare -a CMAKE_TEST_OPTIONS=()
[ ! -z '${{ matrix.cmake_config || '' }}' ] && CMAKE_TEST_OPTIONS+=(-C '${{ matrix.cmake_config }}')
cmake -B out . ${CMAKE_OPTIONS[*]} -DCMAKE_C_FLAGS='${{ matrix.c_flags || '' }}'
cmake --build out ${CMAKE_BUILD_OPTIONS[*]}
cd out; ctest ${CMAKE_TEST_OPTIONS[*]}; cd ..
- name: Quick Fuzz
if: ${{ matrix.build_system == 'fuzz' }}
run: |
mkdir ${RUNNER_TEMP}/decode_corpora
unzip java/org/brotli/integration/fuzz_data.zip -d ${RUNNER_TEMP}/decode_corpora
cd ${GITHUB_WORKSPACE}/c/fuzz
bazelisk build --config=asan-libfuzzer :decode_fuzzer
for f in `ls ${RUNNER_TEMP}/decode_corpora`
do
echo "Testing $f"
./bazel-bin/decode_fuzzer_bin ${RUNNER_TEMP}/decode_corpora/$f
done
- name: Build with Bazel
if: ${{ matrix.build_system == 'bazel' }}
run: |
cd ${GITHUB_WORKSPACE}/${{ matrix.bazel_project }}
bazelisk build -c opt ...:all --java_runtime_version=remotejdk_21
- name: Fix symlinks for Bazel (Windows)
if: ${{ matrix.build_system == 'bazel' && runner.os == 'Windows' && matrix.bazel_project == 'java' }}
shell: python
run: |
import fnmatch
import os
import os.path
from shutil import copyfile
os.chdir('${{ matrix.bazel_project }}')
print('Searching for manifests in ' + os.getcwd())
matches = []
for root, dirnames, filenames in os.walk('bazel-bin\\org\\brotli'):
for filename in fnmatch.filter(filenames, '*.runfiles_manifest'):
matches.append(os.path.join(root, filename))
for match in matches:
print('Scanning manifest ' + match)
runfiles = match[:-len('_manifest')]
with open(match) as manifest:
for entry in manifest:
entry = entry.strip()
if not entry.startswith("_main"):
continue
if entry.startswith("_main/external"):
continue
(alias, space, link) = entry.partition(' ')
if alias.endswith('.jar') or alias.endswith('.exe'):
continue
link = link.replace('/', '\\')
alias = alias.replace('/', '\\')
dst = os.path.join(runfiles, alias)
if not os.path.exists(dst):
print(link + ' -> ' + dst)
parent = os.path.dirname(dst)
if not os.path.exists(parent):
os.makedirs(parent)
copyfile(link, dst)
print('Finished resolving symlinks')
- name: Test with Bazel
if: ${{ matrix.build_system == 'bazel' }}
run: |
cd ${GITHUB_WORKSPACE}/${{ matrix.bazel_project }}
bazelisk query "tests(...)" --output=label > ${RUNNER_TEMP}/tests.lst
[ -s ${RUNNER_TEMP}/tests.lst ] && bazelisk test -c opt ...:all --java_runtime_version=remotejdk_21
bazelisk clean
- name: Build / Test with Maven
if: ${{ matrix.build_system == 'maven' }}
run: |
export MAVEN_OPTS=-Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn
cd java/org/brotli
mvn -B install
# TODO(eustas): nuke maven build?
# cd integration
# mvn -B verify
- uses: actions/setup-python@e797f83bcb11b83ae66e0230d6156d7c80228e7c # v6.0.0
if: ${{ matrix.build_system == 'python' }}
with:
python-version: ${{ matrix.python_version }}
# TODO(eustas): use modern setuptools (split out testing)
- name: Build / Test with Python
if: ${{ matrix.build_system == 'python' }}
run: |
python -VV
python -c "import sys; sys.exit('Invalid python version') if '.'.join(map(str,sys.version_info[0:2])) != '${{ matrix.python_version }}' else True"
pip install setuptools==51.3.3
python setup.py ${{ matrix.py_setuptools_cmd || 'test'}}
build_test_py27:
name: Build and test with Python 2.7
runs-on: ubuntu-latest
container:
image: ubuntu:22.04
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Install deps
run: |
apt update
apt install -y curl gcc python2.7 python2.7-dev
curl https://bootstrap.pypa.io/pip/2.7/get-pip.py --output get-pip.py
python2.7 get-pip.py
python2.7 -m pip install distutils-pytest==0.1
- name: Checkout the source
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: false
fetch-depth: 1
- name: Build / Test
run: |
python2.7 -VV
python2.7 -c "import sys; sys.exit('Invalid python version') if '.'.join(map(str,sys.version_info[0:2])) != '2.7' else True"
python2.7 setup.py test

70
.github/workflows/build_test_wasm.yml vendored Normal file
View File

@@ -0,0 +1,70 @@
# Copyright 2025 Google Inc. All Rights Reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
# Workflow for building and running tests with WASM
name: Build/Test WASM
on:
push:
branches:
- master
pull_request:
types: [opened, reopened, labeled, unlabeled, synchronize]
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
build_test_wasm:
name: Build and test with WASM
runs-on: ubuntu-latest
env:
CCACHE_DIR: ${{ github.workspace }}/.ccache
BUILD_TARGET: wasm32
EM_VERSION: 3.1.51
# As of 28.08.2025 ubuntu-latest is 24.04; it is shipped with node 22.18
NODE_VERSION: 22
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: true
fetch-depth: 1
- name: Install node
uses: actions/setup-node@a0853c24544627f65ddf259abe73b1d18a591444 # v5.0.0
with:
node-version: ${{env.NODE_VERSION}}
- name: Get non-EMSDK node path
run: which node >> $HOME/.base_node_path
- name: Install emsdk
uses: mymindstorm/setup-emsdk@6ab9eb1bda2574c4ddb79809fc9247783eaf9021 # v14
with:
version: ${{env.EM_VERSION}}
no-cache: true
- name: Set EMSDK node version
run: |
echo "NODE_JS='$(cat $HOME/.base_node_path)'" >> $EMSDK/.emscripten
emsdk construct_env
- name: Build
run: |
LDFLAGS=" -s ALLOW_MEMORY_GROWTH=1 -s NODERAWFS=1 " emcmake cmake -B out .
cmake --build out
cd out; ctest --output-on-failure; cd ..

81
.github/workflows/codeql.yml vendored Normal file
View File

@@ -0,0 +1,81 @@
name: "CodeQL"
on:
push:
branches: [ "master" ]
pull_request:
# The branches below must be a subset of the branches above
branches: [ "master" ]
schedule:
- cron: '18 15 * * 0'
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
analyze:
name: Analyze
runs-on: 'ubuntu-latest'
timeout-minutes: 360
permissions:
actions: read
contents: read
security-events: write
strategy:
fail-fast: false
matrix:
language: [ 'cpp', 'java', 'javascript', 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python', 'ruby', 'swift' ]
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@f443b600d91635bebf5b0d9ebc620189c0d6fba5 # v3.29.5
with:
languages: ${{ matrix.language }}
# CodeQL is currently crashing on files with large lists:
# https://github.com/github/codeql/issues/13656
config: |
paths-ignore:
- research
- js/test_data.*
- if: matrix.language == 'cpp'
name: Build CPP
uses: github/codeql-action/autobuild@f443b600d91635bebf5b0d9ebc620189c0d6fba5 # v3.29.5
- if: matrix.language == 'cpp' || matrix.language == 'java'
name: Build Java
run: |
cd ${GITHUB_WORKSPACE}/java
bazelisk build --spawn_strategy=local --nouse_action_cache -c opt ...:all
- if: matrix.language == 'javascript'
name: Build JS
uses: github/codeql-action/autobuild@f443b600d91635bebf5b0d9ebc620189c0d6fba5 # v3.29.5
- if: matrix.language == 'cpp' || matrix.language == 'python'
name: Build Python
run: |
python setup.py build_ext
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@f443b600d91635bebf5b0d9ebc620189c0d6fba5 # v3.29.5
with:
category: "/language:${{matrix.language}}"
ref: "${{ github.ref != 'master' && github.ref || '/refs/heads/master' }}"
sha: "${{ github.sha }}"

47
.github/workflows/fuzz.yml vendored Normal file
View File

@@ -0,0 +1,47 @@
# Copyright 2020 Google Inc. All Rights Reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
# Workflow for building / running oss-fuzz.
name: CIFuzz
on: [pull_request]
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
Fuzzing:
runs-on: ubuntu-latest
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Build Fuzzers
uses: google/oss-fuzz/infra/cifuzz/actions/build_fuzzers@3e6a7fd7bcd631647ab9beed1fe0897498e6af39 # 22.09.2025
with:
oss-fuzz-project-name: 'brotli'
dry-run: false
- name: Run Fuzzers
uses: google/oss-fuzz/infra/cifuzz/actions/run_fuzzers@3e6a7fd7bcd631647ab9beed1fe0897498e6af39 # 22.09.2025
with:
oss-fuzz-project-name: 'brotli'
fuzz-seconds: 600
dry-run: false
- name: Upload Crash
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
if: failure()
with:
name: artifacts
path: ./out/artifacts

50
.github/workflows/lint.yml vendored Normal file
View File

@@ -0,0 +1,50 @@
# Copyright 2025 Google Inc. All Rights Reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
# Workflow for checking typos and buildifier, formatting, etc.
name: "Lint"
on:
push:
branches: [ "master" ]
pull_request:
branches: [ "master" ]
schedule:
- cron: '18 15 * * 0'
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
check:
name: Lint
runs-on: 'ubuntu-latest'
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Checkout repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
- name: Install tools
run: |
eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"
brew install buildifier typos-cli
- name: Check typos
run: |
eval "$(/home/linuxbrew/.linuxbrew/bin/brew shellenv)"
./scripts/check_typos.sh
# TODO(eustas): run buildifier

245
.github/workflows/release.yaml vendored Normal file
View File

@@ -0,0 +1,245 @@
# Copyright 2023 Google Inc. All Rights Reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
# Workflow for building the release binaries.
name: Release build / deploy
on:
push:
branches:
- master
- v*.*.*
release:
types: [ published ]
pull_request:
types: [opened, reopened, labeled, unlabeled, synchronize]
permissions:
contents: read
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
windows_build:
name: Windows Build (vcpkg / ${{ matrix.triplet }})
runs-on: ${{ matrix.runs_on }}
strategy:
fail-fast: false
matrix:
include:
- triplet: x86-windows-dynamic
arch: '-A Win32'
build_shared_libs: 'ON'
runs_on: windows-latest
- triplet: x64-windows-dynamic
arch: '-A x64'
build_shared_libs: 'ON'
runs_on: windows-latest
- triplet: arm64-windows-dynamic
arch: '-A ARM64'
build_shared_libs: 'ON'
runs_on: windows-11-arm
- triplet: x86-windows-static
arch: '-A Win32'
build_shared_libs: 'OFF'
runs_on: windows-latest
- triplet: x64-windows-static
arch: '-A x64'
build_shared_libs: 'OFF'
runs_on: windows-latest
- triplet: arm64-windows-static
arch: '-A ARM64'
build_shared_libs: 'OFF'
runs_on: windows-11-arm
env:
VCPKG_VERSION: '2022.11.14'
VCPKG_ROOT: vcpkg
VCPKG_DISABLE_METRICS: 1
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Checkout the source
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: false
fetch-depth: 1
- uses: actions/cache@0057852bfaa89a56745cba8c7296529d2fc39830 # v4.3.0
id: cache-vcpkg
with:
path: vcpkg
key: release-${{ runner.os }}-vcpkg-${{ env.VCPKG_VERSION }}-${{ matrix.triplet }}
- name: Download vcpkg
if: steps.cache-vcpkg.outputs.cache-hit != 'true'
shell: 'powershell'
run: |
Invoke-WebRequest -Uri "https://github.com/microsoft/vcpkg/archive/refs/tags/${{ env.VCPKG_VERSION }}.zip" -OutFile "vcpkg.zip"
- name: Bootstrap vcpkg
if: steps.cache-vcpkg.outputs.cache-hit != 'true'
shell: 'bash'
run: |
set -x
unzip -q vcpkg.zip
rm -rf ${VCPKG_ROOT}
mv vcpkg-${VCPKG_VERSION} ${VCPKG_ROOT}
${VCPKG_ROOT}/bootstrap-vcpkg.sh
- name: Configure
shell: 'bash'
run: |
set -x
mkdir out
cmake -Bout -H. ${{ matrix.arch }} \
-DBUILD_TESTING=OFF \
-DBUILD_SHARED_LIBS=${{ matrix.build_shared_libs }} \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=`pwd`/prefix \
-DCMAKE_TOOLCHAIN_FILE=${VCPKG_ROOT}/scripts/buildsystems/vcpkg.cmake \
-DVCPKG_TARGET_TRIPLET=${{ matrix.triplet }} \
#
- name: Build
shell: 'bash'
run: |
set -x
cmake --build out --config Release
- name: Install
shell: 'bash'
run: |
set -x
cmake --build out --config Release --target install
cp LICENSE prefix/bin/LICENSE.brotli
- name: Package release zip
shell: 'powershell'
run: |
Compress-Archive -Path prefix\bin\* `
-DestinationPath brotli-${{matrix.triplet}}.zip
- name: Upload package
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: brotli-${{matrix.triplet}}
path: brotli-${{matrix.triplet}}.zip
compression-level: 0
testdata_upload:
name: Upload testdata
runs-on: 'ubuntu-latest'
defaults:
run:
shell: bash
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Checkout the source
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: false
fetch-depth: 1
- name: Compress testdata
run: |
tar cvfJ testdata.txz tests/testdata
- name: Upload archive
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: testdata
path: testdata.txz
compression-level: 0
publish_release_assets:
name: Publish release assets
needs: [windows_build, testdata_upload]
if: github.event_name == 'release'
runs-on: [ubuntu-latest]
permissions:
contents: write
steps:
- name: Checkout the source
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: false
fetch-depth: 1
- name: Download all artifacts
uses: actions/download-artifact@634f93cb2916e3fdff6788551b99b062d0335ce0 # v5.0.0
with:
path: release_assets
merge-multiple: true
- name: Publish assets
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
gh release upload ${{ github.event.release.tag_name }} ./release_assets/*
archive_build:
needs: publish_release_assets
name: Build and test from archive
runs-on: 'ubuntu-latest'
defaults:
run:
shell: bash
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: Checkout the source
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
submodules: false
fetch-depth: 1
- name: Archive
run: |
git archive HEAD -o archive.tgz
- name: Pick tag
run: |
echo "BROTLI_TAG=$(git describe --tags --abbrev=0)" >> $GITHUB_ENV
- name: Extract
run: |
mkdir archive
cd archive
tar xvzf ../archive.tgz
- name: Download testdata
run: |
cd archive
scripts/download_testdata.sh
- name: Configure and Build
run: |
cd archive
cmake -B out .
cmake --build out
- name: Test
run: |
cd archive
cd out
ctest

82
.github/workflows/scorecard.yml vendored Normal file
View File

@@ -0,0 +1,82 @@
# This workflow uses actions that are not certified by GitHub. They are provided
# by a third-party and are governed by separate terms of service, privacy
# policy, and support documentation.
name: Scorecard supply-chain security
on:
# For Branch-Protection check. Only the default branch is supported. See
# https://github.com/ossf/scorecard/blob/main/docs/checks.md#branch-protection
branch_protection_rule:
# To guarantee Maintained check is occasionally updated. See
# https://github.com/ossf/scorecard/blob/main/docs/checks.md#maintained
schedule:
- cron: '23 21 * * 1'
push:
branches: [ "master" ]
# Declare default permissions as read only.
permissions: read-all
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: ${{ github.event_name == 'pull_request' }}
jobs:
analysis:
name: Scorecard analysis
runs-on: ubuntu-latest
permissions:
# Needed to upload the results to code-scanning dashboard.
security-events: write
# Needed to publish results and get a badge (see publish_results below).
id-token: write
# Uncomment the permissions below if installing in a private repository.
# contents: read
# actions: read
steps:
- name: Harden Runner
uses: step-security/harden-runner@f4a75cfd619ee5ce8d5b864b0d183aff3c69b55a # v2.13.1
with:
egress-policy: audit
- name: "Checkout code"
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
persist-credentials: false
- name: "Run analysis"
uses: ossf/scorecard-action@4eaacf0543bb3f2c246792bd56e8cdeffafb205a # v2.4.3
with:
results_file: results.sarif
results_format: sarif
# (Optional) "write" PAT token. Uncomment the `repo_token` line below if:
# - you want to enable the Branch-Protection check on a *public* repository, or
# - you are installing Scorecard on a *private* repository
# To create the PAT, follow the steps in https://github.com/ossf/scorecard-action#authentication-with-pat.
# repo_token: ${{ secrets.SCORECARD_TOKEN }}
# Public repositories:
# - Publish results to OpenSSF REST API for easy access by consumers
# - Allows the repository to include the Scorecard badge.
# - See https://github.com/ossf/scorecard-action#publishing-results.
# For private repositories:
# - `publish_results` will always be set to `false`, regardless
# of the value entered here.
publish_results: true
# Upload the results as artifacts (optional). Commenting out will disable uploads of run results in SARIF
# format to the repository Actions tab.
- name: "Upload artifact"
uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
with:
name: SARIF file
path: results.sarif
retention-days: 5
# Upload the results to GitHub's code scanning dashboard.
- name: "Upload to code-scanning"
uses: github/codeql-action/upload-sarif@17783bfb99b07f70fae080b654aed0c514057477 # v2.23.3
with:
sarif_file: results.sarif

24
.gitignore vendored
View File

@@ -1,8 +1,20 @@
# C
*.d
*.o
*.pyc
*.txt.uncompressed
*.bro
*.unbro
tools/bro
build/
*.obj
bin/
buildfiles/
**/obj/
dist/
**/bazel-*
# Python
__pycache__/
*.py[cod]
*.so
*.egg-info/
# Tests
*.txt.uncompressed
*.br
*.unbr

3
.gitmodules vendored
View File

@@ -1,3 +0,0 @@
[submodule "terryfy"]
path = terryfy
url = https://github.com/MacPython/terryfy.git

View File

@@ -1,32 +0,0 @@
language:
- objective-c
env:
matrix:
- INSTALL_TYPE='system' VERSION=2.7
- INSTALL_TYPE='macpython' VERSION=2.7.10 CC=clang CXX=clang++
- INSTALL_TYPE='macpython' VERSION=3.4.3 CC=clang CXX=clang++
- INSTALL_TYPE='macpython' VERSION=3.5.0 CC=clang CXX=clang++
- INSTALL_TYPE='homebrew' VERSION=2.7.10
- INSTALL_TYPE='homebrew' VERSION=3.4.3
- INSTALL_TYPE='homebrew' VERSION=3.5.0
install:
- source terryfy/travis_tools.sh
- get_python_environment $INSTALL_TYPE $VERSION venv
- pip install --upgrade wheel
script:
- python setup.py build_ext test
after_success:
- pip wheel -w dist .
before_deploy:
- export WHEELS=$(ls ./dist/*.whl)
deploy:
provider: releases
api_key:
secure: YcCBi6W/w4dtKCa59Wfm8L5lGWvK7KxaFNDr3yh1Hz5aStXXf758pEMHGewnlbfbwuj5a3SjBb1nLp1M69OQJfxm442uXBaBKo52PM9PPbD7NjvbNIso73pqcSODXQXKuZxDFpEhfuDTVq3hUkUqiwhChWhrFucJsSL51i7qSss=
file:
- "${WHEELS}"
skip_cleanup: true
on:
repo: google/brotli
tags: true

141
BUILD.bazel Normal file
View File

@@ -0,0 +1,141 @@
# Description:
# Brotli is a generic-purpose lossless compression algorithm.
package(
default_visibility = ["//visibility:public"],
)
licenses(["notice"]) # MIT
exports_files(["LICENSE"])
config_setting(
name = "clang-cl",
flag_values = {
"@bazel_tools//tools/cpp:compiler": "clang-cl",
},
visibility = ["//visibility:public"],
)
config_setting(
name = "msvc",
flag_values = {
"@bazel_tools//tools/cpp:compiler": "msvc-cl",
},
visibility = ["//visibility:public"],
)
STRICT_C_OPTIONS = select({
":msvc": [],
":clang-cl": [
"/W4",
"-Wconversion",
"-Wlong-long",
"-Wmissing-declarations",
"-Wmissing-prototypes",
"-Wno-strict-aliasing",
"-Wshadow",
"-Wsign-compare",
"-Wno-sign-conversion",
],
"//conditions:default": [
"--pedantic-errors",
"-Wall",
"-Wconversion",
"-Werror",
"-Wextra",
"-Wlong-long",
"-Wmissing-declarations",
"-Wmissing-prototypes",
"-Wno-strict-aliasing",
"-Wshadow",
"-Wsign-compare",
],
})
filegroup(
name = "public_headers",
srcs = glob(["c/include/brotli/*.h"]),
)
filegroup(
name = "common_headers",
srcs = glob(["c/common/*.h"]),
)
filegroup(
name = "common_sources",
srcs = glob(["c/common/*.c"]),
)
filegroup(
name = "dec_headers",
srcs = glob(["c/dec/*.h"]),
)
filegroup(
name = "dec_sources",
srcs = glob(["c/dec/*.c"]),
)
filegroup(
name = "enc_headers",
srcs = glob(["c/enc/*.h"]),
)
filegroup(
name = "enc_sources",
srcs = glob(["c/enc/*.c"]),
)
cc_library(
name = "brotli_inc",
hdrs = [":public_headers"],
copts = STRICT_C_OPTIONS,
strip_include_prefix = "c/include",
)
cc_library(
name = "brotlicommon",
srcs = [":common_sources"],
hdrs = [":common_headers"],
copts = STRICT_C_OPTIONS,
deps = [":brotli_inc"],
)
cc_library(
name = "brotlidec",
srcs = [":dec_sources"],
hdrs = [":dec_headers"],
copts = STRICT_C_OPTIONS,
deps = [":brotlicommon"],
)
cc_library(
name = "brotlienc",
srcs = [":enc_sources"],
hdrs = [":enc_headers"],
copts = STRICT_C_OPTIONS,
linkopts = select({
":clang-cl": [],
":msvc": [],
"//conditions:default": ["-lm"],
}),
deps = [":brotlicommon"],
)
cc_binary(
name = "brotli",
srcs = ["c/tools/brotli.c"],
copts = STRICT_C_OPTIONS,
linkstatic = 1,
deps = [
":brotlidec",
":brotlienc",
],
)
filegroup(
name = "dictionary",
srcs = ["c/common/dictionary.bin"],
)

291
CHANGELOG.md Normal file
View File

@@ -0,0 +1,291 @@
# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## Unreleased
## [1.2.0] - 2025-10-xx
### SECURITY
- python: added `Decompressor::can_accept_more_data` method and optional
`output_buffer_limit` argument `Decompressor::process`;
that allows mitigation of unexpectedly large output;
reported by Charles Chan (https://github.com/charleswhchan)
### Added
- **decoder / encoder: added static initialization to reduce binary size**
- python: allow limiting decoder output (see SECURITY section)
- CLI: `brcat` alias; allow decoding concatenated brotli streams
- kt: pure Kotlin decoder
- cgo: support "raw" dictionaries
- build: Bazel modules
### Removed
- java: dropped `finalize()` for native entities
### Fixed
- java: in `compress` pass correct length to native encoder
### Improved
- build: install man pages
- build: updated / fixed / refined Bazel buildfiles
- encoder: faster encoding
- cgo: link via pkg-config
- python: modernize extension / allow multi-phase module initialization
### Changed
- decoder / encoder: static tables use "small" model (allows 2GiB+ binaries)
## [1.1.0] - 2023-08-28
### Added
- decoder: `BrotliDecoderAttachDictionary`
- decoder: `BrotliDecoderOnFinish` callback behind `BROTLI_REPORTING`
- decoder: `BrotliDecoderSetMetadataCallbacks`
- encoder: `BrotliEncoderPrepareDictionary`,
`BrotliEncoderDestroyPreparedDictionary`,
`BrotliEncoderAttachPreparedDictionary`
- decoder: `BrotliEncoderOnFinish` callback behind `BROTLI_REPORTING`
- common: `BrotliSharedDictionaryCreateInstance`,
`BrotliSharedDictionaryDestroyInstance`,
`BrotliSharedDictionaryAttach`
- CLI: `--dictionary` option
- java: encoder wrapper: `Parameters.mode`
- java: `Brotli{Input|Output}Stream.attachDictionary`
- java: wrapper: partial byte array input
- typescript: decoder (transpiled from Java)
### Removed
- build: `BROTLI_BUILD_PORTABLE` option
### Fixed
- java: JNI decoder failed sometimes on power of 2 payloads
### Improved
- java / js: smaller decoder footprint
- decoder: faster decoding
- encoder: faster encoding
- encoder: smaller stack frames
## [1.0.9] - 2020-08-27
Re-release of 1.0.8.
## [1.0.8] - 2020-08-27
### SECURITY
- CVE-2020-8927: potential overflow when input chunk is >2GiB
### Added
- encoder: `BROTLI_PARAM_STREAM_OFFSET`
### Improved
- CLI: better reporting
- CLI: workaround for "lying feof"
- java: faster decoding
- java: support "large window"
- encoder: use less memory
- release: filter sources for the tarball
## [1.0.7] - 2018-10-23
### Improved
- decoder: faster decoding on ARM CPU
## [1.0.6] - 2018-09-13
### Fixed
- build: AutoMake and CMake build
- java: JDK 8<->9 incompatibility
## [1.0.5] - 2018-06-27
### Added
- scripts: extraction of static dictionary from RFC
### Improved
- encoder: better compression at quality 1
- encoder: better compression with "large window"
## [1.0.4] - 2018-03-29
### Added
- encoder: `BROTLI_PARAM_NPOSTFIX`, `BROTLI_PARAM_NDIRECT`
- CLI: `--large_window` option
### Improved
- encoder: better compression
## [1.0.3] - 2018-03-02
### Added
- decoder: `BROTLI_DECODER_PARAM_LARGE_WINDOW` enum
- encoder: `BROTLI_PARAM_LARGE_WINDOW` enum
- java: `BrotliInputStream.setEager`
### Fixed
- build: AutoMake build in some environments
- encoder: fix one-shot q=10 1-byte input compression
### Improved
- encoder: better font compression
## [1.0.2] - 2017-11-28
### Added
- build: AutoMake
- research: better dictionary generators
## [1.0.1] - 2017-09-22
### Changed
- clarifications in `README.md`
## [1.0.0] - 2017-09-20
### Added
- decoder: `BrotliDecoderSetParameter`
- csharp: decoder (transpiled from Java)
- java: JNI wrappers
- javascript: decoder (transpiled from Java)
- python: streaming decompression
- research: dictionary generator
### Changed
- CLI: rename `bro` to `brotli`
### Removed
- decoder: `BrotliDecoderSetCustomDictionary`
- encoder: `BrotliEncoderSetCustomDictionary`
### Improved
- java: faster decoding
- encoder: faster compression
## [0.6.0] - 2017-04-10
### Added
- CLI: `--no-copy-stat option
- java: pure java decoder
- build: fuzzers
- research: `brotlidump` tool to explore brotli streams
- go: wrapper
### Removed
- decoder: API with plain `Brotli` prefix
### Deprecated
- encoder: `BrotliEncoderInputBlockSize`, `BrotliEncoderCopyInputToRingBuffer`,
`BrotliEncoderWriteData`
### Improved
- encoder: faster compression
- encoder: denser compression
- decoder: faster decompression
- python: release GIL
- python: use zero-copy API
## [0.5.2] - 2016-08-11
### Added
- common: `BROTLI_BOOL`, `BROTLI_TRUE`, `BROTLI_FALSE`
- decoder: API with `BrotliDecoder` prefix instead of plain `Brotli`
- build: Bazel, CMake
### Deprecated
- decoder: API with plain `Brotli` prefix
### Changed
- boolean argument / result types are re-branded as `BROTLI_BOOL`
### Improved
- build: reduced amount of warnings in various build environments
- encoder: faster compression
- encoder: lower memory usage
## [0.5.0] - 2016-06-15
### Added
- common: library has been assembled from shared parts of decoder and encoder
- encoder: C API
### Removed
- encoder: C++ API
## [0.4.0] - 2016-06-14
### Added
- encoder: faster compression modes (quality 0 and 1)
- decoder: `BrotliGetErrorCode`, `BrotliErrorString` and
`BROTLI_ERROR_CODES_LIST`
### Removed
- decoder: deprecated streaming API (using `BrotliInput`)
### Fixed
- decoder: possible pointer underflow
### Improved
- encoder: faster compression
## [0.3.0] - 2015-12-22
### LICENSE
License have been upgraded to more permissive MIT.
### Added
- CLI: `--window` option
- `tools/version.h` file
- decoder: low level streaming API
- decoder: custom memory manager API
### Deprecated
- decoder: streaming API using `BrotliInput` struct
### Fixed
- decoder: processing of uncompressed blocks
- encoder: possible division by zero
### Improved
- encoder: faster decompression
- build: more portable builds for various CPU architectures
## [0.2.0] - 2015-09-01
### Added
- CLI: `--verbose` and `--repeat` options
### Fixed
- decoder: processing of uncompressed blocks
- encoder: block stitching on quality 10 / 11
### Improved
- build: CI/CD integration
- build: better test coverage
- encoder: better compression of UTF-8 content
- encoder: faster decompression
## [0.1.0] - 2015-08-11
Initial release.

395
CMakeLists.txt Normal file
View File

@@ -0,0 +1,395 @@
# Available CMake versions:
# - Ubuntu 20.04 LTS : 3.16.3
# - Solaris 11.4 SRU 15 : 3.15
cmake_minimum_required(VERSION 3.15)
# Since this project's version is loaded from other files, this policy
# will help suppress the warning generated by cmake.
# This policy is set because we can't provide "VERSION" in "project" command.
# Use `cmake --help-policy CMP0048` for more information.
cmake_policy(SET CMP0048 NEW)
project(brotli C)
if (NOT CMAKE_BUILD_TYPE AND NOT CMAKE_CONFIGURATION_TYPES)
message(STATUS "Setting build type to Release as none was specified")
set(CMAKE_BUILD_TYPE "Release" CACHE STRING "Choose the type of build" FORCE)
else()
message(STATUS "Build type is '${CMAKE_BUILD_TYPE}'")
endif()
include(CheckCSourceCompiles)
check_c_source_compiles(
"#if defined(__EMSCRIPTEN__)
int main() {return 0;}
#endif"
BROTLI_EMSCRIPTEN
)
if (BROTLI_EMSCRIPTEN)
message("-- Compiler is EMSCRIPTEN")
else()
message("-- Compiler is not EMSCRIPTEN")
endif()
if (BROTLI_EMSCRIPTEN)
message(STATUS "Switching to static build for EMSCRIPTEN")
set(BUILD_SHARED_LIBS OFF)
endif()
# Reflect CMake variable as a build option.
option(BUILD_SHARED_LIBS "Build shared libraries" ON)
set(BROTLI_BUILD_TOOLS ON CACHE BOOL "Build/install CLI tools")
set(BROTLI_BUILD_FOR_PACKAGE OFF CACHE BOOL "Build/install both shared and static libraries")
if (BROTLI_BUILD_FOR_PACKAGE AND NOT BUILD_SHARED_LIBS)
message(FATAL_ERROR "Both BROTLI_BUILD_FOR_PACKAGE and BUILD_SHARED_LIBS are set")
endif()
# If Brotli is being bundled in another project, we don't want to
# install anything. However, we want to let people override this, so
# we'll use the BROTLI_BUNDLED_MODE variable to let them do that; just
# set it to OFF in your project before you add_subdirectory(brotli).
get_directory_property(BROTLI_PARENT_DIRECTORY PARENT_DIRECTORY)
if (NOT DEFINED BROTLI_BUNDLED_MODE)
# Bundled mode hasn't been set one way or the other, set the default
# depending on whether or not we are the top-level project.
if (BROTLI_PARENT_DIRECTORY)
set(BROTLI_BUNDLED_MODE ON)
else()
set(BROTLI_BUNDLED_MODE OFF)
endif()
endif() # BROTLI_BUNDLED_MODE
mark_as_advanced(BROTLI_BUNDLED_MODE)
include(GNUInstallDirs)
# Reads macro from .h file; it is expected to be a single-line define.
function(read_macro PATH MACRO OUTPUT)
file(STRINGS ${PATH} _line REGEX "^#define +${MACRO} +(.+)$")
string(REGEX REPLACE "^#define +${MACRO} +(.+)$" "\\1" _val "${_line}")
set(${OUTPUT} ${_val} PARENT_SCOPE)
endfunction(read_macro)
# Version information
read_macro("c/common/version.h" "BROTLI_VERSION_MAJOR" BROTLI_VERSION_MAJOR)
read_macro("c/common/version.h" "BROTLI_VERSION_MINOR" BROTLI_VERSION_MINOR)
read_macro("c/common/version.h" "BROTLI_VERSION_PATCH" BROTLI_VERSION_PATCH)
set(BROTLI_VERSION "${BROTLI_VERSION_MAJOR}.${BROTLI_VERSION_MINOR}.${BROTLI_VERSION_PATCH}")
mark_as_advanced(BROTLI_VERSION BROTLI_VERSION_MAJOR BROTLI_VERSION_MINOR BROTLI_VERSION_PATCH)
# ABI Version information
read_macro("c/common/version.h" "BROTLI_ABI_CURRENT" BROTLI_ABI_CURRENT)
read_macro("c/common/version.h" "BROTLI_ABI_REVISION" BROTLI_ABI_REVISION)
read_macro("c/common/version.h" "BROTLI_ABI_AGE" BROTLI_ABI_AGE)
math(EXPR BROTLI_ABI_COMPATIBILITY "${BROTLI_ABI_CURRENT} - ${BROTLI_ABI_AGE}")
mark_as_advanced(BROTLI_ABI_CURRENT BROTLI_ABI_REVISION BROTLI_ABI_AGE BROTLI_ABI_COMPATIBILITY)
if (ENABLE_SANITIZER)
set(CMAKE_C_FLAGS " ${CMAKE_C_FLAGS} -fsanitize=${ENABLE_SANITIZER}")
set(CMAKE_CXX_FLAGS " ${CMAKE_CXX_FLAGS} -fsanitize=${ENABLE_SANITIZER}")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fsanitize=${ENABLE_SANITIZER}")
endif ()
include(CheckLibraryExists)
set(LIBM_LIBRARY)
set(LIBM_DEP)
CHECK_LIBRARY_EXISTS(m log2 "" HAVE_LIB_M)
if (HAVE_LIB_M)
set(LIBM_LIBRARY "m")
if (NOT BUILD_SHARED_LIBS)
set(LIBM_DEP "-lm")
endif()
endif()
set(BROTLI_INCLUDE_DIRS "${CMAKE_CURRENT_SOURCE_DIR}/c/include")
mark_as_advanced(BROTLI_INCLUDE_DIRS)
if (BROTLI_BUILD_FOR_PACKAGE)
set(BROTLI_SHARED_LIBRARIES brotlienc brotlidec brotlicommon)
set(BROTLI_STATIC_LIBRARIES brotlienc-static brotlidec-static brotlicommon-static)
set(BROTLI_LIBRARIES ${BROTLI_SHARED_LIBRARIES} ${LIBM_LIBRARY})
else() # NOT BROTLI_BUILD_FOR_PACKAGE
if (BUILD_SHARED_LIBS)
set(BROTLI_SHARED_LIBRARIES brotlienc brotlidec brotlicommon)
set(BROTLI_STATIC_LIBRARIES)
else() # NOT BUILD_SHARED_LIBS
set(BROTLI_SHARED_LIBRARIES)
set(BROTLI_STATIC_LIBRARIES brotlienc brotlidec brotlicommon)
endif()
set(BROTLI_LIBRARIES ${BROTLI_SHARED_LIBRARIES} ${BROTLI_STATIC_LIBRARIES} ${LIBM_LIBRARY})
endif() # BROTLI_BUILD_FOR_PACKAGE
mark_as_advanced(BROTLI_LIBRARIES)
if (MSVC)
message(STATUS "Defining _CRT_SECURE_NO_WARNINGS to avoid warnings about security")
add_definitions(-D_CRT_SECURE_NO_WARNINGS)
endif()
file(GLOB_RECURSE BROTLI_COMMON_SOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} c/common/*.c)
file(GLOB_RECURSE BROTLI_DEC_SOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} c/dec/*.c)
file(GLOB_RECURSE BROTLI_ENC_SOURCES RELATIVE ${CMAKE_CURRENT_SOURCE_DIR} c/enc/*.c)
add_library(brotlicommon ${BROTLI_COMMON_SOURCES})
add_library(brotlidec ${BROTLI_DEC_SOURCES})
add_library(brotlienc ${BROTLI_ENC_SOURCES})
if (BROTLI_BUILD_FOR_PACKAGE)
add_library(brotlicommon-static STATIC ${BROTLI_COMMON_SOURCES})
add_library(brotlidec-static STATIC ${BROTLI_DEC_SOURCES})
add_library(brotlienc-static STATIC ${BROTLI_ENC_SOURCES})
endif()
# Older CMake versions does not understand INCLUDE_DIRECTORIES property.
include_directories(${BROTLI_INCLUDE_DIRS})
if (BUILD_SHARED_LIBS)
foreach(lib ${BROTLI_SHARED_LIBRARIES})
target_compile_definitions(${lib} PUBLIC "BROTLI_SHARED_COMPILATION" )
string(TOUPPER "${lib}" LIB)
set_target_properties (${lib} PROPERTIES DEFINE_SYMBOL "${LIB}_SHARED_COMPILATION")
endforeach()
endif() # BUILD_SHARED_LIBS
foreach(lib ${BROTLI_SHARED_LIBRARIES} ${BROTLI_STATIC_LIBRARIES})
target_link_libraries(${lib} ${LIBM_LIBRARY})
set_property(TARGET ${lib} APPEND PROPERTY INCLUDE_DIRECTORIES ${BROTLI_INCLUDE_DIRS})
set_target_properties(${lib} PROPERTIES
VERSION "${BROTLI_ABI_COMPATIBILITY}.${BROTLI_ABI_AGE}.${BROTLI_ABI_REVISION}"
SOVERSION "${BROTLI_ABI_COMPATIBILITY}")
if (NOT BROTLI_EMSCRIPTEN)
set_target_properties(${lib} PROPERTIES POSITION_INDEPENDENT_CODE TRUE)
endif()
set_property(TARGET ${lib} APPEND PROPERTY INTERFACE_INCLUDE_DIRECTORIES "$<BUILD_INTERFACE:${BROTLI_INCLUDE_DIRS}>")
endforeach() # BROTLI_xxx_LIBRARIES
target_link_libraries(brotlidec brotlicommon)
target_link_libraries(brotlienc brotlicommon)
# For projects stuck on older versions of CMake, this will set the
# BROTLI_INCLUDE_DIRS and BROTLI_LIBRARIES variables so they still
# have a relatively easy way to use Brotli:
#
# include_directories(${BROTLI_INCLUDE_DIRS})
# target_link_libraries(foo ${BROTLI_LIBRARIES})
if (BROTLI_PARENT_DIRECTORY)
set(BROTLI_INCLUDE_DIRS "${BROTLI_INCLUDE_DIRS}" PARENT_SCOPE)
set(BROTLI_LIBRARIES "${BROTLI_LIBRARIES}" PARENT_SCOPE)
endif()
# Build the brotli executable
if (BROTLI_BUILD_TOOLS)
add_executable(brotli c/tools/brotli.c)
target_link_libraries(brotli ${BROTLI_LIBRARIES})
endif()
# Installation
if (NOT BROTLI_BUNDLED_MODE)
if (BROTLI_BUILD_TOOLS)
install(
TARGETS brotli
RUNTIME DESTINATION "${CMAKE_INSTALL_BINDIR}"
)
endif()
install(
TARGETS ${BROTLI_SHARED_LIBRARIES} ${BROTLI_STATIC_LIBRARIES}
ARCHIVE DESTINATION "${CMAKE_INSTALL_LIBDIR}"
LIBRARY DESTINATION "${CMAKE_INSTALL_LIBDIR}"
RUNTIME DESTINATION "${CMAKE_INSTALL_BINDIR}"
)
install(
DIRECTORY ${BROTLI_INCLUDE_DIRS}/brotli
DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}"
)
endif() # BROTLI_BUNDLED_MODE
# Tests
# Integration tests, those depend on `brotli` binary
if (NOT BROTLI_DISABLE_TESTS AND BROTLI_BUILD_TOOLS)
# If we're targeting Windows but not running on Windows, we need Wine
# to run the tests...
if (WIN32 AND NOT CMAKE_HOST_WIN32)
find_program(BROTLI_WRAPPER NAMES wine)
if (NOT BROTLI_WRAPPER)
message(STATUS "wine not found, disabling tests")
set(BROTLI_DISABLE_TESTS TRUE)
endif()
endif() # WIN32 emulation
if (BROTLI_EMSCRIPTEN)
find_program(BROTLI_WRAPPER NAMES node)
if (NOT BROTLI_WRAPPER)
message(STATUS "node not found, disabling tests")
set(BROTLI_DISABLE_TESTS TRUE)
endif()
endif() # BROTLI_EMSCRIPTEN
endif() # BROTLI_DISABLE_TESTS
# NB: BROTLI_DISABLE_TESTS might have changed.
if (NOT BROTLI_DISABLE_TESTS AND BROTLI_BUILD_TOOLS)
# If our compiler is a cross-compiler that we know about (arm/aarch64),
# then we need to use qemu to execute the tests.
if ("${CMAKE_C_COMPILER}" MATCHES "^.*/arm-linux-gnueabihf-.*$")
message(STATUS "Detected arm-linux-gnueabihf cross-compilation")
set(BROTLI_WRAPPER "qemu-arm")
set(BROTLI_WRAPPER_LD_PREFIX "/usr/arm-linux-gnueabihf")
endif()
if ("${CMAKE_C_COMPILER}" MATCHES "^.*/arm-linux-gnueabi-.*$")
message(STATUS "Detected arm-linux-gnueabi cross-compilation")
set(BROTLI_WRAPPER "qemu-arm")
set(BROTLI_WRAPPER_LD_PREFIX "/usr/arm-linux-gnueabi")
endif()
if ("${CMAKE_C_COMPILER}" MATCHES "^.*/aarch64-linux-gnu-.*$")
message(STATUS "Detected aarch64-linux-gnu cross-compilation")
set(BROTLI_WRAPPER "qemu-aarch64")
set(BROTLI_WRAPPER_LD_PREFIX "/usr/aarch64-linux-gnu")
endif()
include(CTest)
enable_testing()
set(ROUNDTRIP_INPUTS
tests/testdata/alice29.txt
tests/testdata/asyoulik.txt
tests/testdata/lcet10.txt
tests/testdata/plrabn12.txt
c/enc/encode.c
c/common/dictionary.h
c/dec/decode.c)
foreach(INPUT ${ROUNDTRIP_INPUTS})
get_filename_component(OUTPUT_NAME "${INPUT}" NAME)
set(OUTPUT_FILE "${CMAKE_CURRENT_BINARY_DIR}/${OUTPUT_NAME}")
set(INPUT_FILE "${CMAKE_CURRENT_SOURCE_DIR}/${INPUT}")
if (EXISTS "${INPUT_FILE}")
foreach(quality 1 6 9 11)
add_test(NAME "${BROTLI_TEST_PREFIX}roundtrip/${INPUT}/${quality}"
COMMAND "${CMAKE_COMMAND}"
-DBROTLI_WRAPPER=${BROTLI_WRAPPER}
-DBROTLI_WRAPPER_LD_PREFIX=${BROTLI_WRAPPER_LD_PREFIX}
-DBROTLI_CLI=$<TARGET_FILE:brotli>
-DQUALITY=${quality}
-DINPUT=${INPUT_FILE}
-DOUTPUT=${OUTPUT_FILE}.${quality}
-P ${CMAKE_CURRENT_SOURCE_DIR}/tests/run-roundtrip-test.cmake)
endforeach()
else()
message(NOTICE "Test file ${INPUT} does not exist; OK on tarball builds; consider running scripts/download_testdata.sh before configuring.")
endif()
endforeach()
file(GLOB_RECURSE
COMPATIBILITY_INPUTS
RELATIVE ${CMAKE_CURRENT_SOURCE_DIR}
tests/testdata/*.compressed*)
foreach(INPUT ${COMPATIBILITY_INPUTS})
string(REGEX REPLACE "([a-zA-Z0-9\\.]+)\\.compressed(\\.[0-9]+)?$" "\\1" UNCOMPRESSED_INPUT "${INPUT}")
if (EXISTS ${UNCOMPRESSED_INPUT})
add_test(NAME "${BROTLI_TEST_PREFIX}compatibility/${INPUT}"
COMMAND "${CMAKE_COMMAND}"
-DBROTLI_WRAPPER=${BROTLI_WRAPPER}
-DBROTLI_WRAPPER_LD_PREFIX=${BROTLI_WRAPPER_LD_PREFIX}
-DBROTLI_CLI=$<TARGET_FILE:brotli>
-DINPUT=${CMAKE_CURRENT_SOURCE_DIR}/${INPUT}
-P ${CMAKE_CURRENT_SOURCE_DIR}/tests/run-compatibility-test.cmake)
endif()
endforeach()
endif() # BROTLI_DISABLE_TESTS
# Generate a pkg-config files
function(generate_pkg_config_path outvar path)
string(LENGTH "${path}" path_length)
set(path_args ${ARGV})
list(REMOVE_AT path_args 0 1)
list(LENGTH path_args path_args_remaining)
set("${outvar}" "${path}")
while(path_args_remaining GREATER 1)
list(GET path_args 0 name)
list(GET path_args 1 value)
get_filename_component(value_full "${value}" ABSOLUTE)
string(LENGTH "${value}" value_length)
if (path_length EQUAL value_length AND path STREQUAL value)
set("${outvar}" "\${${name}}")
break()
elseif (path_length GREATER value_length)
# We might be in a subdirectory of the value, but we have to be
# careful about a prefix matching but not being a subdirectory
# (for example, /usr/lib64 is not a subdirectory of /usr/lib).
# We'll do this by making sure the next character is a directory
# separator.
string(SUBSTRING "${path}" ${value_length} 1 sep)
if (sep STREQUAL "/")
string(SUBSTRING "${path}" 0 ${value_length} s)
if (s STREQUAL value)
string(SUBSTRING "${path}" "${value_length}" -1 suffix)
set("${outvar}" "\${${name}}${suffix}")
break()
endif()
endif()
endif()
list(REMOVE_AT path_args 0 1)
list(LENGTH path_args path_args_remaining)
endwhile()
set("${outvar}" "${${outvar}}" PARENT_SCOPE)
endfunction(generate_pkg_config_path)
function(transform_pc_file INPUT_FILE OUTPUT_FILE VERSION)
file(READ ${INPUT_FILE} TEXT)
set(PREFIX "${CMAKE_INSTALL_PREFIX}")
string(REGEX REPLACE "@prefix@" "${PREFIX}" TEXT ${TEXT})
string(REGEX REPLACE "@exec_prefix@" "${PREFIX}" TEXT ${TEXT})
string(REGEX REPLACE "@libm@" "${LIBM_DEP}" TEXT ${TEXT})
generate_pkg_config_path(LIBDIR "${CMAKE_INSTALL_FULL_LIBDIR}" prefix "${PREFIX}")
string(REGEX REPLACE "@libdir@" "${LIBDIR}" TEXT ${TEXT})
generate_pkg_config_path(INCLUDEDIR "${CMAKE_INSTALL_FULL_INCLUDEDIR}" prefix "${PREFIX}")
string(REGEX REPLACE "@includedir@" "${INCLUDEDIR}" TEXT ${TEXT})
string(REGEX REPLACE "@PACKAGE_VERSION@" "${VERSION}" TEXT ${TEXT})
file(WRITE ${OUTPUT_FILE} ${TEXT})
endfunction()
transform_pc_file("scripts/libbrotlicommon.pc.in" "${CMAKE_CURRENT_BINARY_DIR}/libbrotlicommon.pc" "${BROTLI_VERSION}")
transform_pc_file("scripts/libbrotlidec.pc.in" "${CMAKE_CURRENT_BINARY_DIR}/libbrotlidec.pc" "${BROTLI_VERSION}")
transform_pc_file("scripts/libbrotlienc.pc.in" "${CMAKE_CURRENT_BINARY_DIR}/libbrotlienc.pc" "${BROTLI_VERSION}")
if (NOT BROTLI_BUNDLED_MODE)
install(FILES "${CMAKE_CURRENT_BINARY_DIR}/libbrotlicommon.pc"
DESTINATION "${CMAKE_INSTALL_LIBDIR}/pkgconfig")
install(FILES "${CMAKE_CURRENT_BINARY_DIR}/libbrotlidec.pc"
DESTINATION "${CMAKE_INSTALL_LIBDIR}/pkgconfig")
install(FILES "${CMAKE_CURRENT_BINARY_DIR}/libbrotlienc.pc"
DESTINATION "${CMAKE_INSTALL_LIBDIR}/pkgconfig")
endif() # BROTLI_BUNDLED_MODE
if (BROTLI_BUILD_TOOLS)
install(FILES "docs/brotli.1"
DESTINATION "${CMAKE_INSTALL_FULL_MANDIR}/man1")
endif()
install(FILES docs/constants.h.3 docs/decode.h.3 docs/encode.h.3 docs/types.h.3
DESTINATION "${CMAKE_INSTALL_FULL_MANDIR}/man3")
if (ENABLE_COVERAGE STREQUAL "yes")
setup_target_for_coverage(coverage test coverage)
endif()

View File

@@ -21,6 +21,11 @@ frustration later on.
All submissions, including submissions by project members, require review. We
use Github pull requests for this purpose.
### Code style
Code should follow applicable formatting and style guides described in
[Google Style Guides](https://google.github.io/styleguide/). C code should be
C89 compatible.
### The small print
Contributions made by corporations are covered by a different agreement than
the one above, the [Software Grant and Corporate Contributor License Agreement]

View File

@@ -1,18 +1,19 @@
include CONTRIBUTING.md
include dec/*.c
include dec/*.h
include dec/Makefile
include enc/*.cc
include enc/*.h
include enc/Makefile
include c/common/*.c
include c/common/*.h
include c/dec/*.c
include c/dec/*.h
include c/enc/*.c
include c/enc/*.h
include c/include/brotli/*.h
include LICENSE
include MANIFEST.in
include python/_brotli.cc
include python/bro.py
include python/brotlimodule.cc
include python/brotli.py
include python/README.md
include python/tests/*
include README.md
include setup.py
include shared.mk
include tools/bro.cc
include tools/Makefile
include tools/version.h
include tests/testdata/*
include c/tools/brotli.c

12
MODULE.bazel Normal file
View File

@@ -0,0 +1,12 @@
# Copyright 2025 The Brotli Authors. All rights reserved.
#
# Distributed under MIT license.
# See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
"""Brotli reference implementation"""
module(
name = "brotli",
version = "1.2.0",
repo_name = "org_brotli",
)

15
README Normal file
View File

@@ -0,0 +1,15 @@
BROTLI DATA COMPRESSION LIBRARY
Brotli is a generic-purpose lossless compression algorithm that compresses data
using a combination of a modern variant of the LZ77 algorithm, Huffman coding
and 2nd order context modeling, with a compression ratio comparable to the best
currently available general-purpose compression methods. It is similar in speed
with deflate but offers more dense compression.
The specification of the Brotli Compressed Data Format is defined in RFC 7932
https://datatracker.ietf.org/doc/html/rfc7932
Brotli is open-sourced under the MIT License, see the LICENSE file.
Brotli mailing list:
https://groups.google.com/g/brotli

111
README.md
View File

@@ -1,5 +1,10 @@
brotli
======
<p align="center">
<img src="https://github.com/google/brotli/actions/workflows/build_test.yml/badge.svg" alt="GitHub Actions Build Status" href="https://github.com/google/brotli/actions?query=branch%3Amaster">
<img src="https://oss-fuzz-build-logs.storage.googleapis.com/badges/brotli.svg" alt="Fuzzing Status" href="https://oss-fuzz-build-logs.storage.googleapis.com/index.html#brotli">
</p>
<p align="center"><img src="https://brotli.org/brotli.svg" alt="Brotli" width="64"></p>
### Introduction
Brotli is a generic-purpose lossless compression algorithm that compresses data
using a combination of a modern variant of the LZ77 algorithm, Huffman coding
@@ -7,11 +12,103 @@ and 2nd order context modeling, with a compression ratio comparable to the best
currently available general-purpose compression methods. It is similar in speed
with deflate but offers more dense compression.
The specification of the Brotli Compressed Data Format is defined in the
following internet draft:
http://www.ietf.org/id/draft-alakuijala-brotli
The specification of the Brotli Compressed Data Format is defined in
[RFC 7932](https://datatracker.ietf.org/doc/html/rfc7932).
Brotli is open-sourced under the MIT License, see the LICENSE file.
Brotli mailing list:
https://groups.google.com/forum/#!forum/brotli
> **Please note:** brotli is a "stream" format; it does not contain
> meta-information, like checksums or uncompressed data length. It is possible
> to modify "raw" ranges of the compressed stream and the decoder will not
> notice that.
### Installation
In most Linux distributions, installing `brotli` is just a matter of using
the package management system. For example in Debian-based distributions:
`apt install brotli` will install `brotli`. On MacOS, you can use
[Homebrew](https://brew.sh/): `brew install brotli`.
[![brotli packaging status](https://repology.org/badge/vertical-allrepos/brotli.svg?exclude_unsupported=1&columns=3&exclude_sources=modules,site&header=brotli%20packaging%20status)](https://repology.org/project/brotli/versions)
Of course you can also build brotli from sources.
### Build instructions
#### Vcpkg
You can download and install brotli using the
[vcpkg](https://github.com/Microsoft/vcpkg/) dependency manager:
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install brotli
The brotli port in vcpkg is kept up to date by Microsoft team members and
community contributors. If the version is out of date, please [create an issue
or pull request](https://github.com/Microsoft/vcpkg) on the vcpkg repository.
#### Bazel
See [Bazel](https://www.bazel.build/)
#### CMake
The basic commands to build and install brotli are:
$ mkdir out && cd out
$ cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=./installed ..
$ cmake --build . --config Release --target install
You can use other [CMake](https://cmake.org/) configuration.
#### Python
To install the latest release of the Python module, run the following:
$ pip install brotli
To install the tip-of-the-tree version, run:
$ pip install --upgrade git+https://github.com/google/brotli
See the [Python readme](python/README.md) for more details on installing
from source, development, and testing.
### Contributing
We glad to answer/library related questions in
[brotli mailing list](https://groups.google.com/g/brotli).
Regular issues / feature requests should be reported in
[issue tracker](https://github.com/google/brotli/issues).
For reporting vulnerability please read [SECURITY](SECURITY.md).
For contributing changes please read [CONTRIBUTING](CONTRIBUTING.md).
### Benchmarks
* [Squash Compression Benchmark](https://quixdb.github.io/squash-benchmark/) / [Unstable Squash Compression Benchmark](https://quixdb.github.io/squash-benchmark/unstable/)
* [Large Text Compression Benchmark](https://mattmahoney.net/dc/text.html)
* [Lzturbo Benchmark](https://sites.google.com/site/powturbo/home/benchmark)
### Related projects
> **Disclaimer:** Brotli authors take no responsibility for the third party projects mentioned in this section.
Independent [decoder](https://github.com/madler/brotli) implementation
by Mark Adler, based entirely on format specification.
JavaScript port of brotli [decoder](https://github.com/devongovett/brotli.js).
Could be used directly via `npm install brotli`
Hand ported [decoder / encoder](https://github.com/dominikhlbg/BrotliHaxe)
in haxe by Dominik Homberger.
Output source code: JavaScript, PHP, Python, Java and C#
7Zip [plugin](https://github.com/mcmilk/7-Zip-Zstd)
Dart compression framework with
[fast FFI-based Brotli implementation](https://pub.dev/documentation/es_compression/latest/brotli/)
with ready-to-use prebuilt binaries for Win/Linux/Mac

6
SECURITY.md Normal file
View File

@@ -0,0 +1,6 @@
### Reporting
To report a security issue, please use [https://g.co/vulnz](https://g.co/vulnz).
We use g.co/vulnz for our intake, and do coordination and disclosure here on
GitHub (including using GitHub Security Advisory). The Google Security Team will
respond within 5 working days of your report on g.co/vulnz.

View File

@@ -1,80 +0,0 @@
environment:
global:
# SDK v7.0 MSVC Express 2008's SetEnv.cmd script will fail if the
# /E:ON and /V:ON options are not enabled in the batch script intepreter
# See: http://stackoverflow.com/a/13751649/163740
WITH_COMPILER: "cmd /E:ON /V:ON /C .\\appveyor\\run_with_compiler.cmd"
matrix:
- PYTHON: "C:\\Python27"
PYTHON_VERSION: "2.7.x"
PYTHON_ARCH: "32"
- PYTHON: "C:\\Python34"
PYTHON_VERSION: "3.4.x"
PYTHON_ARCH: "32"
- PYTHON: "C:\\Python35"
PYTHON_VERSION: "3.5.0"
PYTHON_ARCH: "32"
- PYTHON: "C:\\Python27-x64"
PYTHON_VERSION: "2.7.x"
PYTHON_ARCH: "64"
- PYTHON: "C:\\Python34-x64"
PYTHON_VERSION: "3.4.x"
PYTHON_ARCH: "64"
- PYTHON: "C:\\Python35-x64"
PYTHON_VERSION: "3.5.0"
PYTHON_ARCH: "64"
init:
- "ECHO %PYTHON% %PYTHON_VERSION% %PYTHON_ARCH%"
install:
# install Python and pip when not already installed
- ps: if (-not(Test-Path($env:PYTHON))) { & appveyor\install.ps1 }
# prepend newly installed Python to the PATH
- "SET PATH=%PYTHON%;%PYTHON%\\Scripts;%PATH%"
# check that we have the expected version and architecture for Python
- "python --version"
- "python -c \"import struct; print(struct.calcsize('P') * 8)\""
# upgrade pip to avoid out-of-date warnings
- "pip install --disable-pip-version-check --user --upgrade pip"
# install/upgrade setuptools and wheel to build packages
- "pip install --upgrade setuptools wheel"
build: false
test_script:
- "%WITH_COMPILER% python setup.py build_ext test"
after_test:
# if tests are successful, create binary and source packages for the project
- "%WITH_COMPILER% pip wheel -w dist ."
- "%WITH_COMPILER% python setup.py sdist --formats=gztar,zip"
artifacts:
# archive the generated packages in the ci.appveyor.com build report
- path: dist\*.whl
- path: dist\*.zip
- path: dist\*.tar.gz
# For info, see: http://www.appveyor.com/docs/deployment/github
deploy:
- provider: GitHub
auth_token:
secure: dfL56DgbwuGJNNE5GzKi/pAgBQnJ37Du+AnCtnsTnIYxpis8ah3fPmA/G+bn4NJ3
artifact:
draft: false
prerelease: false
on:
appveyor_repo_tag: true

View File

@@ -1,177 +0,0 @@
# Sample script to install Python and pip under Windows
# Authors: Olivier Grisel, Jonathan Helmus, Kyle Kastner, and Alex Willmer
# License: CC0 1.0 Universal: http://creativecommons.org/publicdomain/zero/1.0/
# Source: https://github.com/ogrisel/python-appveyor-demo/blob/master/appveyor/install.ps1
$BASE_URL = "https://www.python.org/ftp/python/"
$GET_PIP_URL = "https://bootstrap.pypa.io/get-pip.py"
$GET_PIP_PATH = "C:\get-pip.py"
$PYTHON_PRERELEASE_REGEX = @"
(?x)
(?<major>\d+)
\.
(?<minor>\d+)
\.
(?<micro>\d+)
(?<prerelease>[a-z]{1,2}\d+)
"@
function Download ($filename, $url) {
$webclient = New-Object System.Net.WebClient
$basedir = $pwd.Path + "\"
$filepath = $basedir + $filename
if (Test-Path $filename) {
Write-Host "Reusing" $filepath
return $filepath
}
# Download and retry up to 3 times in case of network transient errors.
Write-Host "Downloading" $filename "from" $url
$retry_attempts = 2
for ($i = 0; $i -lt $retry_attempts; $i++) {
try {
$webclient.DownloadFile($url, $filepath)
break
}
Catch [Exception]{
Start-Sleep 1
}
}
if (Test-Path $filepath) {
Write-Host "File saved at" $filepath
} else {
# Retry once to get the error message if any at the last try
$webclient.DownloadFile($url, $filepath)
}
return $filepath
}
function ParsePythonVersion ($python_version) {
if ($python_version -match $PYTHON_PRERELEASE_REGEX) {
return ([int]$matches.major, [int]$matches.minor, [int]$matches.micro,
$matches.prerelease)
}
$version_obj = [version]$python_version
return ($version_obj.major, $version_obj.minor, $version_obj.build, "")
}
function DownloadPython ($python_version, $platform_suffix) {
$major, $minor, $micro, $prerelease = ParsePythonVersion $python_version
if (($major -le 2 -and $micro -eq 0) `
-or ($major -eq 3 -and $minor -le 2 -and $micro -eq 0) `
) {
$dir = "$major.$minor"
$python_version = "$major.$minor$prerelease"
} else {
$dir = "$major.$minor.$micro"
}
if ($prerelease) {
if (($major -le 2) `
-or ($major -eq 3 -and $minor -eq 1) `
-or ($major -eq 3 -and $minor -eq 2) `
-or ($major -eq 3 -and $minor -eq 3) `
) {
$dir = "$dir/prev"
}
}
if (($major -le 2) -or ($major -le 3 -and $minor -le 4)) {
$ext = "msi"
if ($platform_suffix) {
$platform_suffix = ".$platform_suffix"
}
} else {
$ext = "exe"
if ($platform_suffix) {
$platform_suffix = "-$platform_suffix"
}
}
$filename = "python-$python_version$platform_suffix.$ext"
$url = "$BASE_URL$dir/$filename"
$filepath = Download $filename $url
return $filepath
}
function InstallPython ($python_version, $architecture, $python_home) {
Write-Host "Installing Python" $python_version "for" $architecture "bit architecture to" $python_home
if (Test-Path $python_home) {
Write-Host $python_home "already exists, skipping."
return $false
}
if ($architecture -eq "32") {
$platform_suffix = ""
} else {
$platform_suffix = "amd64"
}
$installer_path = DownloadPython $python_version $platform_suffix
$installer_ext = [System.IO.Path]::GetExtension($installer_path)
Write-Host "Installing $installer_path to $python_home"
$install_log = $python_home + ".log"
if ($installer_ext -eq '.msi') {
InstallPythonMSI $installer_path $python_home $install_log
} else {
InstallPythonEXE $installer_path $python_home $install_log
}
if (Test-Path $python_home) {
Write-Host "Python $python_version ($architecture) installation complete"
} else {
Write-Host "Failed to install Python in $python_home"
Get-Content -Path $install_log
Exit 1
}
}
function InstallPythonEXE ($exepath, $python_home, $install_log) {
$install_args = "/quiet InstallAllUsers=1 TargetDir=$python_home"
RunCommand $exepath $install_args
}
function InstallPythonMSI ($msipath, $python_home, $install_log) {
$install_args = "/qn /log $install_log /i $msipath TARGETDIR=$python_home"
$uninstall_args = "/qn /x $msipath"
RunCommand "msiexec.exe" $install_args
if (-not(Test-Path $python_home)) {
Write-Host "Python seems to be installed else-where, reinstalling."
RunCommand "msiexec.exe" $uninstall_args
RunCommand "msiexec.exe" $install_args
}
}
function RunCommand ($command, $command_args) {
Write-Host $command $command_args
Start-Process -FilePath $command -ArgumentList $command_args -Wait -Passthru
}
function InstallPip ($python_home) {
$pip_path = $python_home + "\Scripts\pip.exe"
$python_path = $python_home + "\python.exe"
if (-not(Test-Path $pip_path)) {
Write-Host "Installing pip..."
$webclient = New-Object System.Net.WebClient
$webclient.DownloadFile($GET_PIP_URL, $GET_PIP_PATH)
Write-Host "Executing:" $python_path $GET_PIP_PATH
& $python_path $GET_PIP_PATH
} else {
Write-Host "pip already installed."
}
}
function main () {
InstallPython $env:PYTHON_VERSION $env:PYTHON_ARCH $env:PYTHON
InstallPip $env:PYTHON
}
main

View File

@@ -1,80 +0,0 @@
:: To build extensions for 64 bit Python 3, we need to configure environment
:: variables to use the MSVC 2010 C++ compilers from GRMSDKX_EN_DVD.iso of:
:: MS Windows SDK for Windows 7 and .NET Framework 4 (SDK v7.1)
::
:: To build extensions for 64 bit Python 2, we need to configure environment
:: variables to use the MSVC 2008 C++ compilers from GRMSDKX_EN_DVD.iso of:
:: MS Windows SDK for Windows 7 and .NET Framework 3.5 (SDK v7.0)
::
:: 32 bit builds, and 64-bit builds for 3.5 and beyond, do not require specific
:: environment configurations.
::
:: Note: this script needs to be run with the /E:ON and /V:ON flags for the
:: cmd interpreter, at least for (SDK v7.0)
::
:: More details at:
:: https://github.com/cython/cython/wiki/64BitCythonExtensionsOnWindows
:: http://stackoverflow.com/a/13751649/163740
::
:: Original source:
:: https://github.com/ogrisel/python-appveyor-demo/blob/master/appveyor/run_with_env.cmd
::
:: Author: Olivier Grisel
:: License: CC0 1.0 Universal: http://creativecommons.org/publicdomain/zero/1.0/
@ECHO OFF
SET COMMAND_TO_RUN=%*
SET WIN_SDK_ROOT=C:\Program Files\Microsoft SDKs\Windows
SET WIN_WDK=c:\Program Files (x86)\Windows Kits\10\Include\wdf
:: Extract the major and minor versions, and allow for the minor version to be
:: more than 9. This requires the version number to have two dots in it.
SET MAJOR_PYTHON_VERSION=%PYTHON_VERSION:~0,1%
IF "%PYTHON_VERSION:~3,1%" == "." (
SET MINOR_PYTHON_VERSION=%PYTHON_VERSION:~2,1%
) ELSE (
SET MINOR_PYTHON_VERSION=%PYTHON_VERSION:~2,2%
)
:: Based on the Python version, determine what SDK version to use, and whether
:: to set the SDK for 64-bit.
IF %MAJOR_PYTHON_VERSION% == 2 (
SET WINDOWS_SDK_VERSION="v7.0"
SET SET_SDK_64=Y
) ELSE (
IF %MAJOR_PYTHON_VERSION% == 3 (
SET WINDOWS_SDK_VERSION="v7.1"
IF %MINOR_PYTHON_VERSION% LEQ 4 (
SET SET_SDK_64=Y
) ELSE (
SET SET_SDK_64=N
IF EXIST "%WIN_WDK%" (
:: See: https://connect.microsoft.com/VisualStudio/feedback/details/1610302/
REN "%WIN_WDK%" 0wdf
)
)
) ELSE (
ECHO Unsupported Python version: "%MAJOR_PYTHON_VERSION%"
EXIT 1
)
)
IF %PYTHON_ARCH% == 64 (
IF %SET_SDK_64% == Y (
ECHO Configuring Windows SDK %WINDOWS_SDK_VERSION% for Python %MAJOR_PYTHON_VERSION% on a 64 bit architecture
SET DISTUTILS_USE_SDK=1
SET MSSdk=1
"%WIN_SDK_ROOT%\%WINDOWS_SDK_VERSION%\Setup\WindowsSdkVer.exe" -q -version:%WINDOWS_SDK_VERSION%
"%WIN_SDK_ROOT%\%WINDOWS_SDK_VERSION%\Bin\SetEnv.cmd" /x64 /release
ECHO Executing: %COMMAND_TO_RUN%
call %COMMAND_TO_RUN% || EXIT 1
) ELSE (
ECHO Using default MSVC build environment for 64 bit architecture
ECHO Executing: %COMMAND_TO_RUN%
call %COMMAND_TO_RUN% || EXIT 1
)
) ELSE (
ECHO Using default MSVC build environment for 32 bit architecture
ECHO Executing: %COMMAND_TO_RUN%
call %COMMAND_TO_RUN% || EXIT 1
)

15
c/common/constants.c Normal file
View File

@@ -0,0 +1,15 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "constants.h"
const BROTLI_MODEL("small")
BrotliPrefixCodeRange _kBrotliPrefixCodeRanges[BROTLI_NUM_BLOCK_LEN_SYMBOLS] = {
{1, 2}, {5, 2}, {9, 2}, {13, 2}, {17, 3}, {25, 3},
{33, 3}, {41, 3}, {49, 4}, {65, 4}, {81, 4}, {97, 4},
{113, 5}, {145, 5}, {177, 5}, {209, 5}, {241, 6}, {305, 6},
{369, 7}, {497, 8}, {753, 9}, {1265, 10}, {2289, 11}, {4337, 12},
{8433, 13}, {16625, 24}};

198
c/common/constants.h Normal file
View File

@@ -0,0 +1,198 @@
/* Copyright 2016 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/**
* @file
* Common constants used in decoder and encoder API.
*/
#ifndef BROTLI_COMMON_CONSTANTS_H_
#define BROTLI_COMMON_CONSTANTS_H_
#include "platform.h"
/* Specification: 7.3. Encoding of the context map */
#define BROTLI_CONTEXT_MAP_MAX_RLE 16
/* Specification: 2. Compressed representation overview */
#define BROTLI_MAX_NUMBER_OF_BLOCK_TYPES 256
/* Specification: 3.3. Alphabet sizes: insert-and-copy length */
#define BROTLI_NUM_LITERAL_SYMBOLS 256
#define BROTLI_NUM_COMMAND_SYMBOLS 704
#define BROTLI_NUM_BLOCK_LEN_SYMBOLS 26
#define BROTLI_MAX_CONTEXT_MAP_SYMBOLS (BROTLI_MAX_NUMBER_OF_BLOCK_TYPES + \
BROTLI_CONTEXT_MAP_MAX_RLE)
#define BROTLI_MAX_BLOCK_TYPE_SYMBOLS (BROTLI_MAX_NUMBER_OF_BLOCK_TYPES + 2)
/* Specification: 3.5. Complex prefix codes */
#define BROTLI_REPEAT_PREVIOUS_CODE_LENGTH 16
#define BROTLI_REPEAT_ZERO_CODE_LENGTH 17
#define BROTLI_CODE_LENGTH_CODES (BROTLI_REPEAT_ZERO_CODE_LENGTH + 1)
/* "code length of 8 is repeated" */
#define BROTLI_INITIAL_REPEATED_CODE_LENGTH 8
/* "Large Window Brotli" */
/**
* The theoretical maximum number of distance bits specified for large window
* brotli, for 64-bit encoders and decoders. Even when in practice 32-bit
* encoders and decoders only support up to 30 max distance bits, the value is
* set to 62 because it affects the large window brotli file format.
* Specifically, it affects the encoding of simple huffman tree for distances,
* see Specification RFC 7932 chapter 3.4.
*/
#define BROTLI_LARGE_MAX_DISTANCE_BITS 62U
#define BROTLI_LARGE_MIN_WBITS 10
/**
* The maximum supported large brotli window bits by the encoder and decoder.
* Large window brotli allows up to 62 bits, however the current encoder and
* decoder, designed for 32-bit integers, only support up to 30 bits maximum.
*/
#define BROTLI_LARGE_MAX_WBITS 30
/* Specification: 4. Encoding of distances */
#define BROTLI_NUM_DISTANCE_SHORT_CODES 16
/**
* Maximal number of "postfix" bits.
*
* Number of "postfix" bits is stored as 2 bits in meta-block header.
*/
#define BROTLI_MAX_NPOSTFIX 3
#define BROTLI_MAX_NDIRECT 120
#define BROTLI_MAX_DISTANCE_BITS 24U
#define BROTLI_DISTANCE_ALPHABET_SIZE(NPOSTFIX, NDIRECT, MAXNBITS) ( \
BROTLI_NUM_DISTANCE_SHORT_CODES + (NDIRECT) + \
((MAXNBITS) << ((NPOSTFIX) + 1)))
/* BROTLI_NUM_DISTANCE_SYMBOLS == 1128 */
#define BROTLI_NUM_DISTANCE_SYMBOLS \
BROTLI_DISTANCE_ALPHABET_SIZE( \
BROTLI_MAX_NDIRECT, BROTLI_MAX_NPOSTFIX, BROTLI_LARGE_MAX_DISTANCE_BITS)
/* ((1 << 26) - 4) is the maximal distance that can be expressed in RFC 7932
brotli stream using NPOSTFIX = 0 and NDIRECT = 0. With other NPOSTFIX and
NDIRECT values distances up to ((1 << 29) + 88) could be expressed. */
#define BROTLI_MAX_DISTANCE 0x3FFFFFC
/* ((1 << 31) - 4) is the safe distance limit. Using this number as a limit
allows safe distance calculation without overflows, given the distance
alphabet size is limited to corresponding size
(see kLargeWindowDistanceCodeLimits). */
#define BROTLI_MAX_ALLOWED_DISTANCE 0x7FFFFFFC
/* Specification: 4. Encoding of Literal Insertion Lengths and Copy Lengths */
#define BROTLI_NUM_INS_COPY_CODES 24
/* 7.1. Context modes and context ID lookup for literals */
/* "context IDs for literals are in the range of 0..63" */
#define BROTLI_LITERAL_CONTEXT_BITS 6
/* 7.2. Context ID for distances */
#define BROTLI_DISTANCE_CONTEXT_BITS 2
/* 9.1. Format of the Stream Header */
/* Number of slack bytes for window size. Don't confuse
with BROTLI_NUM_DISTANCE_SHORT_CODES. */
#define BROTLI_WINDOW_GAP 16
#define BROTLI_MAX_BACKWARD_LIMIT(W) (((size_t)1 << (W)) - BROTLI_WINDOW_GAP)
typedef struct BrotliDistanceCodeLimit {
uint32_t max_alphabet_size;
uint32_t max_distance;
} BrotliDistanceCodeLimit;
/* This function calculates maximal size of distance alphabet, such that the
distances greater than the given values can not be represented.
This limits are designed to support fast and safe 32-bit decoders.
"32-bit" means that signed integer values up to ((1 << 31) - 1) could be
safely expressed.
Brotli distance alphabet symbols do not represent consecutive distance
ranges. Each distance alphabet symbol (excluding direct distances and short
codes), represent interleaved (for NPOSTFIX > 0) range of distances.
A "group" of consecutive (1 << NPOSTFIX) symbols represent non-interleaved
range. Two consecutive groups require the same amount of "extra bits".
It is important that distance alphabet represents complete "groups".
To avoid complex logic on encoder side about interleaved ranges
it was decided to restrict both sides to complete distance code "groups".
*/
BROTLI_UNUSED_FUNCTION BrotliDistanceCodeLimit BrotliCalculateDistanceCodeLimit(
uint32_t max_distance, uint32_t npostfix, uint32_t ndirect) {
BrotliDistanceCodeLimit result;
/* Marking this function as unused, because not all files
including "constants.h" use it -> compiler warns about that. */
BROTLI_UNUSED(&BrotliCalculateDistanceCodeLimit);
if (max_distance <= ndirect) {
/* This case never happens / exists only for the sake of completeness. */
result.max_alphabet_size = max_distance + BROTLI_NUM_DISTANCE_SHORT_CODES;
result.max_distance = max_distance;
return result;
} else {
/* The first prohibited value. */
uint32_t forbidden_distance = max_distance + 1;
/* Subtract "directly" encoded region. */
uint32_t offset = forbidden_distance - ndirect - 1;
uint32_t ndistbits = 0;
uint32_t tmp;
uint32_t half;
uint32_t group;
/* Postfix for the last dcode in the group. */
uint32_t postfix = (1u << npostfix) - 1;
uint32_t extra;
uint32_t start;
/* Remove postfix and "head-start". */
offset = (offset >> npostfix) + 4;
/* Calculate the number of distance bits. */
tmp = offset / 2;
/* Poor-man's log2floor, to avoid extra dependencies. */
while (tmp != 0) {ndistbits++; tmp = tmp >> 1;}
/* One bit is covered with subrange addressing ("half"). */
ndistbits--;
/* Find subrange. */
half = (offset >> ndistbits) & 1;
/* Calculate the "group" part of dcode. */
group = ((ndistbits - 1) << 1) | half;
/* Calculated "group" covers the prohibited distance value. */
if (group == 0) {
/* This case is added for correctness; does not occur for limit > 128. */
result.max_alphabet_size = ndirect + BROTLI_NUM_DISTANCE_SHORT_CODES;
result.max_distance = ndirect;
return result;
}
/* Decrement "group", so it is the last permitted "group". */
group--;
/* After group was decremented, ndistbits and half must be recalculated. */
ndistbits = (group >> 1) + 1;
/* The last available distance in the subrange has all extra bits set. */
extra = (1u << ndistbits) - 1;
/* Calculate region start. NB: ndistbits >= 1. */
start = (1u << (ndistbits + 1)) - 4;
/* Move to subregion. */
start += (group & 1) << ndistbits;
/* Calculate the alphabet size. */
result.max_alphabet_size = ((group << npostfix) | postfix) + ndirect +
BROTLI_NUM_DISTANCE_SHORT_CODES + 1;
/* Calculate the maximal distance representable by alphabet. */
result.max_distance = ((start + extra) << npostfix) + postfix + ndirect + 1;
return result;
}
}
/* Represents the range of values belonging to a prefix code:
[offset, offset + 2^nbits) */
typedef struct {
uint16_t offset;
uint8_t nbits;
} BrotliPrefixCodeRange;
/* "Soft-private", it is exported, but not "advertised" as API. */
BROTLI_COMMON_API extern const BROTLI_MODEL("small")
BrotliPrefixCodeRange _kBrotliPrefixCodeRanges[BROTLI_NUM_BLOCK_LEN_SYMBOLS];
#endif /* BROTLI_COMMON_CONSTANTS_H_ */

View File

@@ -1,115 +1,81 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
#include "context.h"
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Lookup table to map the previous two bytes to a context id.
There are four different context modeling modes defined here:
CONTEXT_LSB6: context id is the least significant 6 bits of the last byte,
CONTEXT_MSB6: context id is the most significant 6 bits of the last byte,
CONTEXT_UTF8: second-order context model tuned for UTF8-encoded text,
CONTEXT_SIGNED: second-order context model tuned for signed integers.
The context id for the UTF8 context model is calculated as follows. If p1
and p2 are the previous two bytes, we calculate the context as
context = kContextLookup[p1] | kContextLookup[p2 + 256].
If the previous two bytes are ASCII characters (i.e. < 128), this will be
equivalent to
context = 4 * context1(p1) + context2(p2),
where context1 is based on the previous byte in the following way:
0 : non-ASCII control
1 : \t, \n, \r
2 : space
3 : other punctuation
4 : " '
5 : %
6 : ( < [ {
7 : ) > ] }
8 : , ; :
9 : .
10 : =
11 : number
12 : upper-case vowel
13 : upper-case consonant
14 : lower-case vowel
15 : lower-case consonant
and context2 is based on the second last byte:
0 : control, space
1 : punctuation
2 : upper-case letter, number
3 : lower-case letter
If the last byte is ASCII, and the second last byte is not (in a valid UTF8
stream it will be a continuation byte, value between 128 and 191), the
context is the same as if the second last byte was an ASCII control or space.
If the last byte is a UTF8 lead byte (value >= 192), then the next byte will
be a continuation byte and the context id is 2 or 3 depending on the LSB of
the last byte and to a lesser extent on the second last byte if it is ASCII.
If the last byte is a UTF8 continuation byte, the second last byte can be:
- continuation byte: the next byte is probably ASCII or lead byte (assuming
4-byte UTF8 characters are rare) and the context id is 0 or 1.
- lead byte (192 - 207): next byte is ASCII or lead byte, context is 0 or 1
- lead byte (208 - 255): next byte is continuation byte, context is 2 or 3
The possible value combinations of the previous two bytes, the range of
context ids and the type of the next byte is summarized in the table below:
|--------\-----------------------------------------------------------------|
| \ Last byte |
| Second \---------------------------------------------------------------|
| last byte \ ASCII | cont. byte | lead byte |
| \ (0-127) | (128-191) | (192-) |
|=============|===================|=====================|==================|
| ASCII | next: ASCII/lead | not valid | next: cont. |
| (0-127) | context: 4 - 63 | | context: 2 - 3 |
|-------------|-------------------|---------------------|------------------|
| cont. byte | next: ASCII/lead | next: ASCII/lead | next: cont. |
| (128-191) | context: 4 - 63 | context: 0 - 1 | context: 2 - 3 |
|-------------|-------------------|---------------------|------------------|
| lead byte | not valid | next: ASCII/lead | not valid |
| (192-207) | | context: 0 - 1 | |
|-------------|-------------------|---------------------|------------------|
| lead byte | not valid | next: cont. | not valid |
| (208-) | | context: 2 - 3 | |
|-------------|-------------------|---------------------|------------------|
The context id for the signed context mode is calculated as:
context = (kContextLookup[512 + p1] << 3) | kContextLookup[512 + p2].
For any context modeling modes, the context ids can be calculated by |-ing
together two lookups from one table using context model dependent offsets:
context = kContextLookup[offset1 + p1] | kContextLookup[offset2 + p2].
where offset1 and offset2 are dependent on the context mode.
*/
#ifndef BROTLI_DEC_CONTEXT_H_
#define BROTLI_DEC_CONTEXT_H_
#include "./types.h"
enum ContextType {
CONTEXT_LSB6 = 0,
CONTEXT_MSB6 = 1,
CONTEXT_UTF8 = 2,
CONTEXT_SIGNED = 3
};
#include "platform.h"
/* Common context lookup table for all context modes. */
static const uint8_t kContextLookup[1792] = {
const BROTLI_MODEL("small") uint8_t _kBrotliContextLookupTable[2048] = {
/* CONTEXT_LSB6, last byte. */
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
/* CONTEXT_LSB6, second last byte, */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
/* CONTEXT_MSB6, last byte. */
0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7,
8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11,
12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 15,
16, 16, 16, 16, 17, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 19,
20, 20, 20, 20, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23,
24, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 26, 27, 27, 27, 27,
28, 28, 28, 28, 29, 29, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31,
32, 32, 32, 32, 33, 33, 33, 33, 34, 34, 34, 34, 35, 35, 35, 35,
36, 36, 36, 36, 37, 37, 37, 37, 38, 38, 38, 38, 39, 39, 39, 39,
40, 40, 40, 40, 41, 41, 41, 41, 42, 42, 42, 42, 43, 43, 43, 43,
44, 44, 44, 44, 45, 45, 45, 45, 46, 46, 46, 46, 47, 47, 47, 47,
48, 48, 48, 48, 49, 49, 49, 49, 50, 50, 50, 50, 51, 51, 51, 51,
52, 52, 52, 52, 53, 53, 53, 53, 54, 54, 54, 54, 55, 55, 55, 55,
56, 56, 56, 56, 57, 57, 57, 57, 58, 58, 58, 58, 59, 59, 59, 59,
60, 60, 60, 60, 61, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 63,
/* CONTEXT_MSB6, second last byte, */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
/* CONTEXT_UTF8, last byte. */
/* ASCII range. */
0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 0, 0, 4, 0, 0,
@@ -130,6 +96,7 @@ static const uint8_t kContextLookup[1792] = {
2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3,
2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3,
2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3, 2, 3,
/* CONTEXT_UTF8 second last byte. */
/* ASCII range. */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
@@ -150,23 +117,7 @@ static const uint8_t kContextLookup[1792] = {
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
/* CONTEXT_SIGNED, second last byte. */
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7,
/* CONTEXT_SIGNED, last byte, same as the above values shifted by 3 bits. */
0, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8,
16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
@@ -184,68 +135,22 @@ static const uint8_t kContextLookup[1792] = {
40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 48, 56,
/* CONTEXT_LSB6, last byte. */
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
/* CONTEXT_MSB6, last byte. */
0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3,
4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7,
8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11,
12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 15,
16, 16, 16, 16, 17, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 19,
20, 20, 20, 20, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23,
24, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 26, 27, 27, 27, 27,
28, 28, 28, 28, 29, 29, 29, 29, 30, 30, 30, 30, 31, 31, 31, 31,
32, 32, 32, 32, 33, 33, 33, 33, 34, 34, 34, 34, 35, 35, 35, 35,
36, 36, 36, 36, 37, 37, 37, 37, 38, 38, 38, 38, 39, 39, 39, 39,
40, 40, 40, 40, 41, 41, 41, 41, 42, 42, 42, 42, 43, 43, 43, 43,
44, 44, 44, 44, 45, 45, 45, 45, 46, 46, 46, 46, 47, 47, 47, 47,
48, 48, 48, 48, 49, 49, 49, 49, 50, 50, 50, 50, 51, 51, 51, 51,
52, 52, 52, 52, 53, 53, 53, 53, 54, 54, 54, 54, 55, 55, 55, 55,
56, 56, 56, 56, 57, 57, 57, 57, 58, 58, 58, 58, 59, 59, 59, 59,
60, 60, 60, 60, 61, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 63,
/* CONTEXT_{M,L}SB6, second last byte, */
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
};
static const int kContextLookupOffsets[8] = {
/* CONTEXT_LSB6 */
1024, 1536,
/* CONTEXT_MSB6 */
1280, 1536,
/* CONTEXT_UTF8 */
0, 256,
/* CONTEXT_SIGNED */
768, 512,
/* CONTEXT_SIGNED, second last byte. */
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7,
};
#endif /* BROTLI_DEC_CONTEXT_H_ */

112
c/common/context.h Normal file
View File

@@ -0,0 +1,112 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Lookup table to map the previous two bytes to a context id.
There are four different context modeling modes defined here:
CONTEXT_LSB6: context id is the least significant 6 bits of the last byte,
CONTEXT_MSB6: context id is the most significant 6 bits of the last byte,
CONTEXT_UTF8: second-order context model tuned for UTF8-encoded text,
CONTEXT_SIGNED: second-order context model tuned for signed integers.
If |p1| and |p2| are the previous two bytes, and |mode| is current context
mode, we calculate the context as:
context = ContextLut(mode)[p1] | ContextLut(mode)[p2 + 256].
For CONTEXT_UTF8 mode, if the previous two bytes are ASCII characters
(i.e. < 128), this will be equivalent to
context = 4 * context1(p1) + context2(p2),
where context1 is based on the previous byte in the following way:
0 : non-ASCII control
1 : \t, \n, \r
2 : space
3 : other punctuation
4 : " '
5 : %
6 : ( < [ {
7 : ) > ] }
8 : , ; :
9 : .
10 : =
11 : number
12 : upper-case vowel
13 : upper-case consonant
14 : lower-case vowel
15 : lower-case consonant
and context2 is based on the second last byte:
0 : control, space
1 : punctuation
2 : upper-case letter, number
3 : lower-case letter
If the last byte is ASCII, and the second last byte is not (in a valid UTF8
stream it will be a continuation byte, value between 128 and 191), the
context is the same as if the second last byte was an ASCII control or space.
If the last byte is a UTF8 lead byte (value >= 192), then the next byte will
be a continuation byte and the context id is 2 or 3 depending on the LSB of
the last byte and to a lesser extent on the second last byte if it is ASCII.
If the last byte is a UTF8 continuation byte, the second last byte can be:
- continuation byte: the next byte is probably ASCII or lead byte (assuming
4-byte UTF8 characters are rare) and the context id is 0 or 1.
- lead byte (192 - 207): next byte is ASCII or lead byte, context is 0 or 1
- lead byte (208 - 255): next byte is continuation byte, context is 2 or 3
The possible value combinations of the previous two bytes, the range of
context ids and the type of the next byte is summarized in the table below:
|--------\-----------------------------------------------------------------|
| \ Last byte |
| Second \---------------------------------------------------------------|
| last byte \ ASCII | cont. byte | lead byte |
| \ (0-127) | (128-191) | (192-) |
|=============|===================|=====================|==================|
| ASCII | next: ASCII/lead | not valid | next: cont. |
| (0-127) | context: 4 - 63 | | context: 2 - 3 |
|-------------|-------------------|---------------------|------------------|
| cont. byte | next: ASCII/lead | next: ASCII/lead | next: cont. |
| (128-191) | context: 4 - 63 | context: 0 - 1 | context: 2 - 3 |
|-------------|-------------------|---------------------|------------------|
| lead byte | not valid | next: ASCII/lead | not valid |
| (192-207) | | context: 0 - 1 | |
|-------------|-------------------|---------------------|------------------|
| lead byte | not valid | next: cont. | not valid |
| (208-) | | context: 2 - 3 | |
|-------------|-------------------|---------------------|------------------|
*/
#ifndef BROTLI_COMMON_CONTEXT_H_
#define BROTLI_COMMON_CONTEXT_H_
#include "platform.h"
typedef enum ContextType {
CONTEXT_LSB6 = 0,
CONTEXT_MSB6 = 1,
CONTEXT_UTF8 = 2,
CONTEXT_SIGNED = 3
} ContextType;
/* "Soft-private", it is exported, but not "advertised" as API. */
/* Common context lookup table for all context modes. */
BROTLI_COMMON_API extern const uint8_t _kBrotliContextLookupTable[2048];
typedef const uint8_t* ContextLut;
/* typeof(MODE) == ContextType; returns ContextLut */
#define BROTLI_CONTEXT_LUT(MODE) (&_kBrotliContextLookupTable[(MODE) << 9])
/* typeof(LUT) == ContextLut */
#define BROTLI_CONTEXT(P1, P2, LUT) ((LUT)[P1] | ((LUT) + 256)[P2])
#endif /* BROTLI_COMMON_CONTEXT_H_ */

432
c/common/dictionary.bin Normal file

File diff suppressed because one or more lines are too long

BIN
c/common/dictionary.bin.br Normal file

Binary file not shown.

64
c/common/dictionary.c Normal file
View File

@@ -0,0 +1,64 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "dictionary.h"
#include "platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#if !defined(BROTLI_EXTERNAL_DICTIONARY_DATA)
/* Embed kBrotliDictionaryData */
#include "dictionary_inc.h"
static const BROTLI_MODEL("small") BrotliDictionary kBrotliDictionary = {
#else
static BROTLI_MODEL("small") BrotliDictionary kBrotliDictionary = {
#endif
/* size_bits_by_length */
{
0, 0, 0, 0, 10, 10, 11, 11,
10, 10, 10, 10, 10, 9, 9, 8,
7, 7, 8, 7, 7, 6, 6, 5,
5, 0, 0, 0, 0, 0, 0, 0
},
/* offsets_by_length */
{
0, 0, 0, 0, 0, 4096, 9216, 21504,
35840, 44032, 53248, 63488, 74752, 87040, 93696, 100864,
104704, 106752, 108928, 113536, 115968, 118528, 119872, 121280,
122016, 122784, 122784, 122784, 122784, 122784, 122784, 122784
},
/* data_size == sizeof(kBrotliDictionaryData) */
122784,
/* data */
#if defined(BROTLI_EXTERNAL_DICTIONARY_DATA)
NULL
#else
kBrotliDictionaryData
#endif
};
const BrotliDictionary* BrotliGetDictionary(void) {
return &kBrotliDictionary;
}
void BrotliSetDictionaryData(const uint8_t* data) {
#if defined(BROTLI_EXTERNAL_DICTIONARY_DATA)
if (!!data && !kBrotliDictionary.data) {
kBrotliDictionary.data = data;
}
#else
BROTLI_UNUSED(data); // Appease -Werror=unused-parameter
#endif
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

63
c/common/dictionary.h Normal file
View File

@@ -0,0 +1,63 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Collection of static dictionary words. */
#ifndef BROTLI_COMMON_DICTIONARY_H_
#define BROTLI_COMMON_DICTIONARY_H_
#include "platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
typedef struct BrotliDictionary {
/**
* Number of bits to encode index of dictionary word in a bucket.
*
* Specification: Appendix A. Static Dictionary Data
*
* Words in a dictionary are bucketed by length.
* @c 0 means that there are no words of a given length.
* Dictionary consists of words with length of [4..24] bytes.
* Values at [0..3] and [25..31] indices should not be addressed.
*/
uint8_t size_bits_by_length[32];
/* assert(offset[i + 1] == offset[i] + (bits[i] ? (i << bits[i]) : 0)) */
uint32_t offsets_by_length[32];
/* assert(data_size == offsets_by_length[31]) */
size_t data_size;
/* Data array is not bound, and should obey to size_bits_by_length values.
Specified size matches default (RFC 7932) dictionary. Its size is
defined by data_size */
const uint8_t* data;
} BrotliDictionary;
BROTLI_COMMON_API const BrotliDictionary* BrotliGetDictionary(void);
/**
* Sets dictionary data.
*
* When dictionary data is already set / present, this method is no-op.
*
* Dictionary data MUST be provided before BrotliGetDictionary is invoked.
* This method is used ONLY in multi-client environment (e.g. C + Java),
* to reduce storage by sharing single dictionary between implementations.
*/
BROTLI_COMMON_API void BrotliSetDictionaryData(const uint8_t* data);
#define BROTLI_MIN_DICTIONARY_WORD_LENGTH 4
#define BROTLI_MAX_DICTIONARY_WORD_LENGTH 24
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_COMMON_DICTIONARY_H_ */

5847
c/common/dictionary_inc.h Normal file

File diff suppressed because it is too large Load Diff

19
c/common/platform.c Normal file
View File

@@ -0,0 +1,19 @@
/* Copyright 2016 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "platform.h"
/* Default brotli_alloc_func */
void* BrotliDefaultAllocFunc(void* opaque, size_t size) {
BROTLI_UNUSED(opaque);
return malloc(size);
}
/* Default brotli_free_func */
void BrotliDefaultFreeFunc(void* opaque, void* address) {
BROTLI_UNUSED(opaque);
free(address);
}

680
c/common/platform.h Normal file
View File

@@ -0,0 +1,680 @@
/* Copyright 2016 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Macros for compiler / platform specific features and build options.
Build options are:
* BROTLI_BUILD_32_BIT disables 64-bit optimizations
* BROTLI_BUILD_64_BIT forces to use 64-bit optimizations
* BROTLI_BUILD_BIG_ENDIAN forces to use big-endian optimizations
* BROTLI_BUILD_ENDIAN_NEUTRAL disables endian-aware optimizations
* BROTLI_BUILD_LITTLE_ENDIAN forces to use little-endian optimizations
* BROTLI_BUILD_NO_RBIT disables "rbit" optimization for ARM CPUs
* BROTLI_BUILD_NO_UNALIGNED_READ_FAST forces off the fast-unaligned-read
optimizations (mainly for testing purposes)
* BROTLI_DEBUG dumps file name and line number when decoder detects stream
or memory error
* BROTLI_ENABLE_LOG enables asserts and dumps various state information
* BROTLI_ENABLE_DUMP overrides default "dump" behaviour
*/
#ifndef BROTLI_COMMON_PLATFORM_H_
#define BROTLI_COMMON_PLATFORM_H_
#include <string.h> /* IWYU pragma: export memcmp, memcpy, memset */
#include <stdlib.h> /* IWYU pragma: export exit, free, malloc */
#include <sys/types.h> /* should include endian.h for us */
#include <brotli/port.h> /* IWYU pragma: export */
#include <brotli/types.h> /* IWYU pragma: export */
#if BROTLI_MSVC_VERSION_CHECK(18, 0, 0)
#include <intrin.h>
#endif
#if defined(BROTLI_ENABLE_LOG) || defined(BROTLI_DEBUG)
#include <assert.h>
#include <stdio.h>
#endif
/* The following macros were borrowed from https://github.com/nemequ/hedley
* with permission of original author - Evan Nemerson <evan@nemerson.com> */
/* >>> >>> >>> hedley macros */
/* Define "BROTLI_PREDICT_TRUE" and "BROTLI_PREDICT_FALSE" macros for capable
compilers.
To apply compiler hint, enclose the branching condition into macros, like this:
if (BROTLI_PREDICT_TRUE(zero == 0)) {
// main execution path
} else {
// compiler should place this code outside of main execution path
}
OR:
if (BROTLI_PREDICT_FALSE(something_rare_or_unexpected_happens)) {
// compiler should place this code outside of main execution path
}
*/
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_expect, 3, 0, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0) || \
BROTLI_SUNPRO_VERSION_CHECK(5, 15, 0) || \
BROTLI_ARM_VERSION_CHECK(4, 1, 0) || \
BROTLI_IBM_VERSION_CHECK(10, 1, 0) || \
BROTLI_TI_VERSION_CHECK(7, 3, 0) || \
BROTLI_TINYC_VERSION_CHECK(0, 9, 27)
#define BROTLI_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
#define BROTLI_PREDICT_FALSE(x) (__builtin_expect(x, 0))
#else
#define BROTLI_PREDICT_FALSE(x) (x)
#define BROTLI_PREDICT_TRUE(x) (x)
#endif
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L) && \
!defined(__cplusplus)
#define BROTLI_RESTRICT restrict
#elif BROTLI_GNUC_VERSION_CHECK(3, 1, 0) || \
BROTLI_MSVC_VERSION_CHECK(14, 0, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0) || \
BROTLI_ARM_VERSION_CHECK(4, 1, 0) || \
BROTLI_IBM_VERSION_CHECK(10, 1, 0) || \
BROTLI_PGI_VERSION_CHECK(17, 10, 0) || \
BROTLI_TI_VERSION_CHECK(8, 0, 0) || \
BROTLI_IAR_VERSION_CHECK(8, 0, 0) || \
(BROTLI_SUNPRO_VERSION_CHECK(5, 14, 0) && defined(__cplusplus))
#define BROTLI_RESTRICT __restrict
#elif BROTLI_SUNPRO_VERSION_CHECK(5, 3, 0) && !defined(__cplusplus)
#define BROTLI_RESTRICT _Restrict
#else
#define BROTLI_RESTRICT
#endif
#if (defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)) || \
(defined(__cplusplus) && (__cplusplus >= 199711L))
#define BROTLI_MAYBE_INLINE inline
#elif defined(__GNUC_STDC_INLINE__) || defined(__GNUC_GNU_INLINE__) || \
BROTLI_ARM_VERSION_CHECK(6, 2, 0)
#define BROTLI_MAYBE_INLINE __inline__
#elif BROTLI_MSVC_VERSION_CHECK(12, 0, 0) || \
BROTLI_ARM_VERSION_CHECK(4, 1, 0) || BROTLI_TI_VERSION_CHECK(8, 0, 0)
#define BROTLI_MAYBE_INLINE __inline
#else
#define BROTLI_MAYBE_INLINE
#endif
#if BROTLI_GNUC_HAS_ATTRIBUTE(always_inline, 4, 0, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0) || \
BROTLI_SUNPRO_VERSION_CHECK(5, 11, 0) || \
BROTLI_ARM_VERSION_CHECK(4, 1, 0) || \
BROTLI_IBM_VERSION_CHECK(10, 1, 0) || \
BROTLI_TI_VERSION_CHECK(8, 0, 0) || \
(BROTLI_TI_VERSION_CHECK(7, 3, 0) && defined(__TI_GNU_ATTRIBUTE_SUPPORT__))
#define BROTLI_INLINE BROTLI_MAYBE_INLINE __attribute__((__always_inline__))
#elif BROTLI_MSVC_VERSION_CHECK(12, 0, 0)
#define BROTLI_INLINE BROTLI_MAYBE_INLINE __forceinline
#elif BROTLI_TI_VERSION_CHECK(7, 0, 0) && defined(__cplusplus)
#define BROTLI_INLINE BROTLI_MAYBE_INLINE _Pragma("FUNC_ALWAYS_INLINE;")
#elif BROTLI_IAR_VERSION_CHECK(8, 0, 0)
#define BROTLI_INLINE BROTLI_MAYBE_INLINE _Pragma("inline=forced")
#else
#define BROTLI_INLINE BROTLI_MAYBE_INLINE
#endif
#if BROTLI_GNUC_HAS_ATTRIBUTE(noinline, 4, 0, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0) || \
BROTLI_SUNPRO_VERSION_CHECK(5, 11, 0) || \
BROTLI_ARM_VERSION_CHECK(4, 1, 0) || \
BROTLI_IBM_VERSION_CHECK(10, 1, 0) || \
BROTLI_TI_VERSION_CHECK(8, 0, 0) || \
(BROTLI_TI_VERSION_CHECK(7, 3, 0) && defined(__TI_GNU_ATTRIBUTE_SUPPORT__))
#define BROTLI_NOINLINE __attribute__((__noinline__))
#elif BROTLI_MSVC_VERSION_CHECK(13, 10, 0)
#define BROTLI_NOINLINE __declspec(noinline)
#elif BROTLI_PGI_VERSION_CHECK(10, 2, 0)
#define BROTLI_NOINLINE _Pragma("noinline")
#elif BROTLI_TI_VERSION_CHECK(6, 0, 0) && defined(__cplusplus)
#define BROTLI_NOINLINE _Pragma("FUNC_CANNOT_INLINE;")
#elif BROTLI_IAR_VERSION_CHECK(8, 0, 0)
#define BROTLI_NOINLINE _Pragma("inline=never")
#else
#define BROTLI_NOINLINE
#endif
/* <<< <<< <<< end of hedley macros. */
#if BROTLI_GNUC_HAS_ATTRIBUTE(unused, 2, 7, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0)
#define BROTLI_UNUSED_FUNCTION static BROTLI_INLINE __attribute__ ((unused))
#else
#define BROTLI_UNUSED_FUNCTION static BROTLI_INLINE
#endif
#if BROTLI_GNUC_HAS_ATTRIBUTE(aligned, 2, 7, 0)
#define BROTLI_ALIGNED(N) __attribute__((aligned(N)))
#else
#define BROTLI_ALIGNED(N)
#endif
#if (defined(__ARM_ARCH) && (__ARM_ARCH == 7)) || \
(defined(M_ARM) && (M_ARM == 7))
#define BROTLI_TARGET_ARMV7
#endif /* ARMv7 */
#if (defined(__ARM_ARCH) && (__ARM_ARCH == 8)) || \
defined(__aarch64__) || defined(__ARM64_ARCH_8__)
#define BROTLI_TARGET_ARMV8_ANY
#if defined(__ARM_32BIT_STATE)
#define BROTLI_TARGET_ARMV8_32
#elif defined(__ARM_64BIT_STATE)
#define BROTLI_TARGET_ARMV8_64
#endif
#endif /* ARMv8 */
#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#define BROTLI_TARGET_NEON
#endif
#if defined(__i386) || defined(_M_IX86)
#define BROTLI_TARGET_X86
#endif
#if defined(__x86_64__) || defined(_M_X64)
#define BROTLI_TARGET_X64
#endif
#if defined(__PPC64__)
#define BROTLI_TARGET_POWERPC64
#endif
#if defined(__riscv) && defined(__riscv_xlen) && __riscv_xlen == 64
#define BROTLI_TARGET_RISCV64
#endif
#if defined(__loongarch_lp64)
#define BROTLI_TARGET_LOONGARCH64
#endif
/* This does not seem to be an indicator of z/Architecture (64-bit); neither
that allows to use unaligned loads. */
#if defined(__s390x__)
#define BROTLI_TARGET_S390X
#endif
#if defined(__mips64)
#define BROTLI_TARGET_MIPS64
#endif
#if defined(BROTLI_TARGET_X64) || defined(BROTLI_TARGET_ARMV8_64) || \
defined(BROTLI_TARGET_POWERPC64) || defined(BROTLI_TARGET_RISCV64) || \
defined(BROTLI_TARGET_LOONGARCH64) || defined(BROTLI_TARGET_MIPS64)
#define BROTLI_TARGET_64_BITS 1
#else
#define BROTLI_TARGET_64_BITS 0
#endif
#if defined(BROTLI_BUILD_64_BIT)
#define BROTLI_64_BITS 1
#elif defined(BROTLI_BUILD_32_BIT)
#define BROTLI_64_BITS 0
#else
#define BROTLI_64_BITS BROTLI_TARGET_64_BITS
#endif
#if (BROTLI_64_BITS)
#define brotli_reg_t uint64_t
#else
#define brotli_reg_t uint32_t
#endif
#if defined(BROTLI_BUILD_BIG_ENDIAN)
#define BROTLI_BIG_ENDIAN 1
#elif defined(BROTLI_BUILD_LITTLE_ENDIAN)
#define BROTLI_LITTLE_ENDIAN 1
#elif defined(BROTLI_BUILD_ENDIAN_NEUTRAL)
/* Just break elif chain. */
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define BROTLI_LITTLE_ENDIAN 1
#elif defined(_WIN32) || defined(BROTLI_TARGET_X64)
/* Win32 & x64 can currently always be assumed to be little endian */
#define BROTLI_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define BROTLI_BIG_ENDIAN 1
/* Likely target platform is iOS / OSX. */
#elif defined(BYTE_ORDER) && (BYTE_ORDER == LITTLE_ENDIAN)
#define BROTLI_LITTLE_ENDIAN 1
#elif defined(BYTE_ORDER) && (BYTE_ORDER == BIG_ENDIAN)
#define BROTLI_BIG_ENDIAN 1
#endif
#if !defined(BROTLI_LITTLE_ENDIAN)
#define BROTLI_LITTLE_ENDIAN 0
#endif
#if !defined(BROTLI_BIG_ENDIAN)
#define BROTLI_BIG_ENDIAN 0
#endif
#if defined(BROTLI_BUILD_NO_UNALIGNED_READ_FAST)
#define BROTLI_UNALIGNED_READ_FAST (!!0)
#elif defined(BROTLI_TARGET_X86) || defined(BROTLI_TARGET_X64) || \
defined(BROTLI_TARGET_ARMV7) || defined(BROTLI_TARGET_ARMV8_ANY) || \
defined(BROTLI_TARGET_RISCV64) || defined(BROTLI_TARGET_LOONGARCH64)
/* These targets are known to generate efficient code for unaligned reads
* (e.g. a single instruction, not multiple 1-byte loads, shifted and or'd
* together). */
#define BROTLI_UNALIGNED_READ_FAST (!!1)
#else
#define BROTLI_UNALIGNED_READ_FAST (!!0)
#endif
/* Portable unaligned memory access: read / write values via memcpy. */
#if !defined(BROTLI_USE_PACKED_FOR_UNALIGNED)
#if defined(__mips__) && (!defined(__mips_isa_rev) || __mips_isa_rev < 6)
#define BROTLI_USE_PACKED_FOR_UNALIGNED 1
#else
#define BROTLI_USE_PACKED_FOR_UNALIGNED 0
#endif
#endif /* defined(BROTLI_USE_PACKED_FOR_UNALIGNED) */
#if BROTLI_USE_PACKED_FOR_UNALIGNED
typedef union BrotliPackedValue {
uint16_t u16;
uint32_t u32;
uint64_t u64;
size_t szt;
} __attribute__ ((packed)) BrotliPackedValue;
static BROTLI_INLINE uint16_t BrotliUnalignedRead16(const void* p) {
const BrotliPackedValue* address = (const BrotliPackedValue*)p;
return address->u16;
}
static BROTLI_INLINE uint32_t BrotliUnalignedRead32(const void* p) {
const BrotliPackedValue* address = (const BrotliPackedValue*)p;
return address->u32;
}
static BROTLI_INLINE uint64_t BrotliUnalignedRead64(const void* p) {
const BrotliPackedValue* address = (const BrotliPackedValue*)p;
return address->u64;
}
static BROTLI_INLINE size_t BrotliUnalignedReadSizeT(const void* p) {
const BrotliPackedValue* address = (const BrotliPackedValue*)p;
return address->szt;
}
static BROTLI_INLINE void BrotliUnalignedWrite64(void* p, uint64_t v) {
BrotliPackedValue* address = (BrotliPackedValue*)p;
address->u64 = v;
}
#else /* not BROTLI_USE_PACKED_FOR_UNALIGNED */
static BROTLI_INLINE uint16_t BrotliUnalignedRead16(const void* p) {
uint16_t t;
memcpy(&t, p, sizeof t);
return t;
}
static BROTLI_INLINE uint32_t BrotliUnalignedRead32(const void* p) {
uint32_t t;
memcpy(&t, p, sizeof t);
return t;
}
static BROTLI_INLINE uint64_t BrotliUnalignedRead64(const void* p) {
uint64_t t;
memcpy(&t, p, sizeof t);
return t;
}
static BROTLI_INLINE size_t BrotliUnalignedReadSizeT(const void* p) {
size_t t;
memcpy(&t, p, sizeof t);
return t;
}
static BROTLI_INLINE void BrotliUnalignedWrite64(void* p, uint64_t v) {
memcpy(p, &v, sizeof v);
}
#endif /* BROTLI_USE_PACKED_FOR_UNALIGNED */
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_bswap16, 4, 3, 0)
#define BROTLI_BSWAP16(V) ((uint16_t)__builtin_bswap16(V))
#else
#define BROTLI_BSWAP16(V) ((uint16_t)( \
(((V) & 0xFFU) << 8) | \
(((V) >> 8) & 0xFFU)))
#endif
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_bswap32, 4, 3, 0)
#define BROTLI_BSWAP32(V) ((uint32_t)__builtin_bswap32(V))
#else
#define BROTLI_BSWAP32(V) ((uint32_t)( \
(((V) & 0xFFU) << 24) | (((V) & 0xFF00U) << 8) | \
(((V) >> 8) & 0xFF00U) | (((V) >> 24) & 0xFFU)))
#endif
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_bswap64, 4, 3, 0)
#define BROTLI_BSWAP64(V) ((uint64_t)__builtin_bswap64(V))
#else
#define BROTLI_BSWAP64(V) ((uint64_t)( \
(((V) & 0xFFU) << 56) | (((V) & 0xFF00U) << 40) | \
(((V) & 0xFF0000U) << 24) | (((V) & 0xFF000000U) << 8) | \
(((V) >> 8) & 0xFF000000U) | (((V) >> 24) & 0xFF0000U) | \
(((V) >> 40) & 0xFF00U) | (((V) >> 56) & 0xFFU)))
#endif
#if BROTLI_LITTLE_ENDIAN
/* Straight endianness. Just read / write values. */
#define BROTLI_UNALIGNED_LOAD16LE BrotliUnalignedRead16
#define BROTLI_UNALIGNED_LOAD32LE BrotliUnalignedRead32
#define BROTLI_UNALIGNED_LOAD64LE BrotliUnalignedRead64
#define BROTLI_UNALIGNED_STORE64LE BrotliUnalignedWrite64
#elif BROTLI_BIG_ENDIAN /* BROTLI_LITTLE_ENDIAN */
static BROTLI_INLINE uint16_t BROTLI_UNALIGNED_LOAD16LE(const void* p) {
uint16_t value = BrotliUnalignedRead16(p);
return BROTLI_BSWAP16(value);
}
static BROTLI_INLINE uint32_t BROTLI_UNALIGNED_LOAD32LE(const void* p) {
uint32_t value = BrotliUnalignedRead32(p);
return BROTLI_BSWAP32(value);
}
static BROTLI_INLINE uint64_t BROTLI_UNALIGNED_LOAD64LE(const void* p) {
uint64_t value = BrotliUnalignedRead64(p);
return BROTLI_BSWAP64(value);
}
static BROTLI_INLINE void BROTLI_UNALIGNED_STORE64LE(void* p, uint64_t v) {
uint64_t value = BROTLI_BSWAP64(v);
BrotliUnalignedWrite64(p, value);
}
#else /* BROTLI_LITTLE_ENDIAN */
/* Read / store values byte-wise; hopefully compiler will understand. */
static BROTLI_INLINE uint16_t BROTLI_UNALIGNED_LOAD16LE(const void* p) {
const uint8_t* in = (const uint8_t*)p;
return (uint16_t)(in[0] | (in[1] << 8));
}
static BROTLI_INLINE uint32_t BROTLI_UNALIGNED_LOAD32LE(const void* p) {
const uint8_t* in = (const uint8_t*)p;
uint32_t value = (uint32_t)(in[0]);
value |= (uint32_t)(in[1]) << 8;
value |= (uint32_t)(in[2]) << 16;
value |= (uint32_t)(in[3]) << 24;
return value;
}
static BROTLI_INLINE uint64_t BROTLI_UNALIGNED_LOAD64LE(const void* p) {
const uint8_t* in = (const uint8_t*)p;
uint64_t value = (uint64_t)(in[0]);
value |= (uint64_t)(in[1]) << 8;
value |= (uint64_t)(in[2]) << 16;
value |= (uint64_t)(in[3]) << 24;
value |= (uint64_t)(in[4]) << 32;
value |= (uint64_t)(in[5]) << 40;
value |= (uint64_t)(in[6]) << 48;
value |= (uint64_t)(in[7]) << 56;
return value;
}
static BROTLI_INLINE void BROTLI_UNALIGNED_STORE64LE(void* p, uint64_t v) {
uint8_t* out = (uint8_t*)p;
out[0] = (uint8_t)v;
out[1] = (uint8_t)(v >> 8);
out[2] = (uint8_t)(v >> 16);
out[3] = (uint8_t)(v >> 24);
out[4] = (uint8_t)(v >> 32);
out[5] = (uint8_t)(v >> 40);
out[6] = (uint8_t)(v >> 48);
out[7] = (uint8_t)(v >> 56);
}
#endif /* BROTLI_LITTLE_ENDIAN */
static BROTLI_INLINE void* BROTLI_UNALIGNED_LOAD_PTR(const void* p) {
void* v;
memcpy(&v, p, sizeof(void*));
return v;
}
static BROTLI_INLINE void BROTLI_UNALIGNED_STORE_PTR(void* p, const void* v) {
memcpy(p, &v, sizeof(void*));
}
/* BROTLI_IS_CONSTANT macros returns true for compile-time constants. */
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_constant_p, 3, 0, 1) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0)
#define BROTLI_IS_CONSTANT(x) (!!__builtin_constant_p(x))
#else
#define BROTLI_IS_CONSTANT(x) (!!0)
#endif
#if defined(BROTLI_TARGET_ARMV7) || defined(BROTLI_TARGET_ARMV8_ANY)
#define BROTLI_HAS_UBFX (!!1)
#else
#define BROTLI_HAS_UBFX (!!0)
#endif
#if defined(BROTLI_ENABLE_LOG)
#define BROTLI_LOG(x) printf x
#else
#define BROTLI_LOG(x)
#endif
#if defined(BROTLI_DEBUG) || defined(BROTLI_ENABLE_LOG)
#define BROTLI_ENABLE_DUMP_DEFAULT 1
#define BROTLI_DCHECK(x) assert(x)
#else
#define BROTLI_ENABLE_DUMP_DEFAULT 0
#define BROTLI_DCHECK(x)
#endif
#if !defined(BROTLI_ENABLE_DUMP)
#define BROTLI_ENABLE_DUMP BROTLI_ENABLE_DUMP_DEFAULT
#endif
#if BROTLI_ENABLE_DUMP
static BROTLI_INLINE void BrotliDump(const char* f, int l, const char* fn) {
fprintf(stderr, "%s:%d (%s)\n", f, l, fn);
fflush(stderr);
}
#define BROTLI_DUMP() BrotliDump(__FILE__, __LINE__, __FUNCTION__)
#else
#define BROTLI_DUMP() (void)(0)
#endif
/* BrotliRBit assumes brotli_reg_t fits native CPU register type. */
#if (BROTLI_64_BITS == BROTLI_TARGET_64_BITS)
/* TODO(eustas): add appropriate icc/sunpro/arm/ibm/ti checks. */
#if (BROTLI_GNUC_VERSION_CHECK(3, 0, 0) || defined(__llvm__)) && \
!defined(BROTLI_BUILD_NO_RBIT)
#if defined(BROTLI_TARGET_ARMV7) || defined(BROTLI_TARGET_ARMV8_ANY)
/* TODO(eustas): detect ARMv6T2 and enable this code for it. */
static BROTLI_INLINE brotli_reg_t BrotliRBit(brotli_reg_t input) {
brotli_reg_t output;
__asm__("rbit %0, %1\n" : "=r"(output) : "r"(input));
return output;
}
#define BROTLI_RBIT(x) BrotliRBit(x)
#endif /* armv7 / armv8 */
#endif /* gcc || clang */
#endif /* brotli_reg_t is native */
#if !defined(BROTLI_RBIT)
static BROTLI_INLINE void BrotliRBit(void) { /* Should break build if used. */ }
#endif /* BROTLI_RBIT */
#define BROTLI_REPEAT_4(X) {X; X; X; X;}
#define BROTLI_REPEAT_5(X) {X; X; X; X; X;}
#define BROTLI_REPEAT_6(X) {X; X; X; X; X; X;}
#define BROTLI_UNUSED(X) (void)(X)
#define BROTLI_MIN_MAX(T) \
static BROTLI_INLINE T brotli_min_ ## T (T a, T b) { return a < b ? a : b; } \
static BROTLI_INLINE T brotli_max_ ## T (T a, T b) { return a > b ? a : b; }
BROTLI_MIN_MAX(double) BROTLI_MIN_MAX(float) BROTLI_MIN_MAX(int)
BROTLI_MIN_MAX(size_t) BROTLI_MIN_MAX(uint32_t) BROTLI_MIN_MAX(uint8_t)
#undef BROTLI_MIN_MAX
#define BROTLI_MIN(T, A, B) (brotli_min_ ## T((A), (B)))
#define BROTLI_MAX(T, A, B) (brotli_max_ ## T((A), (B)))
#define BROTLI_SWAP(T, A, I, J) { \
T __brotli_swap_tmp = (A)[(I)]; \
(A)[(I)] = (A)[(J)]; \
(A)[(J)] = __brotli_swap_tmp; \
}
#if BROTLI_64_BITS
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_ctzll, 3, 4, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0)
#define BROTLI_TZCNT64 __builtin_ctzll
#elif BROTLI_MSVC_VERSION_CHECK(18, 0, 0)
#if defined(BROTLI_TARGET_X64) && !defined(_M_ARM64EC)
#define BROTLI_TZCNT64 _tzcnt_u64
#else /* BROTLI_TARGET_X64 */
static BROTLI_INLINE uint32_t BrotliBsf64Msvc(uint64_t x) {
uint32_t lsb;
_BitScanForward64(&lsb, x);
return lsb;
}
#define BROTLI_TZCNT64 BrotliBsf64Msvc
#endif /* BROTLI_TARGET_X64 */
#endif /* __builtin_ctzll */
#endif /* BROTLI_64_BITS */
#if BROTLI_GNUC_HAS_BUILTIN(__builtin_clz, 3, 4, 0) || \
BROTLI_INTEL_VERSION_CHECK(16, 0, 0)
#define BROTLI_BSR32(x) (31u ^ (uint32_t)__builtin_clz(x))
#elif BROTLI_MSVC_VERSION_CHECK(18, 0, 0)
static BROTLI_INLINE uint32_t BrotliBsr32Msvc(uint32_t x) {
unsigned long msb;
_BitScanReverse(&msb, x);
return (uint32_t)msb;
}
#define BROTLI_BSR32 BrotliBsr32Msvc
#endif /* __builtin_clz */
/* Default brotli_alloc_func */
BROTLI_COMMON_API void* BrotliDefaultAllocFunc(void* opaque, size_t size);
/* Default brotli_free_func */
BROTLI_COMMON_API void BrotliDefaultFreeFunc(void* opaque, void* address);
/* Circular logical rotates. */
static BROTLI_INLINE uint16_t BrotliRotateRight16(uint16_t const value,
size_t count) {
count &= 0x0F; /* for fickle pattern recognition */
return (value >> count) | (uint16_t)(value << ((0U - count) & 0x0F));
}
static BROTLI_INLINE uint32_t BrotliRotateRight32(uint32_t const value,
size_t count) {
count &= 0x1F; /* for fickle pattern recognition */
return (value >> count) | (uint32_t)(value << ((0U - count) & 0x1F));
}
static BROTLI_INLINE uint64_t BrotliRotateRight64(uint64_t const value,
size_t count) {
count &= 0x3F; /* for fickle pattern recognition */
return (value >> count) | (uint64_t)(value << ((0U - count) & 0x3F));
}
BROTLI_UNUSED_FUNCTION void BrotliSuppressUnusedFunctions(void) {
BROTLI_UNUSED(&BrotliSuppressUnusedFunctions);
BROTLI_UNUSED(&BrotliUnalignedRead16);
BROTLI_UNUSED(&BrotliUnalignedRead32);
BROTLI_UNUSED(&BrotliUnalignedRead64);
BROTLI_UNUSED(&BrotliUnalignedReadSizeT);
BROTLI_UNUSED(&BrotliUnalignedWrite64);
BROTLI_UNUSED(&BROTLI_UNALIGNED_LOAD16LE);
BROTLI_UNUSED(&BROTLI_UNALIGNED_LOAD32LE);
BROTLI_UNUSED(&BROTLI_UNALIGNED_LOAD64LE);
BROTLI_UNUSED(&BROTLI_UNALIGNED_STORE64LE);
BROTLI_UNUSED(&BROTLI_UNALIGNED_LOAD_PTR);
BROTLI_UNUSED(&BROTLI_UNALIGNED_STORE_PTR);
BROTLI_UNUSED(&BrotliRBit);
BROTLI_UNUSED(&brotli_min_double);
BROTLI_UNUSED(&brotli_max_double);
BROTLI_UNUSED(&brotli_min_float);
BROTLI_UNUSED(&brotli_max_float);
BROTLI_UNUSED(&brotli_min_int);
BROTLI_UNUSED(&brotli_max_int);
BROTLI_UNUSED(&brotli_min_size_t);
BROTLI_UNUSED(&brotli_max_size_t);
BROTLI_UNUSED(&brotli_min_uint32_t);
BROTLI_UNUSED(&brotli_max_uint32_t);
BROTLI_UNUSED(&brotli_min_uint8_t);
BROTLI_UNUSED(&brotli_max_uint8_t);
BROTLI_UNUSED(&BrotliDefaultAllocFunc);
BROTLI_UNUSED(&BrotliDefaultFreeFunc);
BROTLI_UNUSED(&BrotliRotateRight16);
BROTLI_UNUSED(&BrotliRotateRight32);
BROTLI_UNUSED(&BrotliRotateRight64);
#if BROTLI_ENABLE_DUMP
BROTLI_UNUSED(&BrotliDump);
#endif
#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_I86)) && \
!defined(_M_ARM64EC)
/* _mm_prefetch() is not defined outside of x86/x64 */
/* https://msdn.microsoft.com/fr-fr/library/84szxsww(v=vs.90).aspx */
#include <mmintrin.h>
#define PREFETCH_L1(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T0)
#define PREFETCH_L2(ptr) _mm_prefetch((const char*)(ptr), _MM_HINT_T1)
#elif BROTLI_GNUC_HAS_BUILTIN(__builtin_prefetch, 3, 1, 0)
#define PREFETCH_L1(ptr) \
__builtin_prefetch((ptr), 0 /* rw==read */, 3 /* locality */)
#define PREFETCH_L2(ptr) \
__builtin_prefetch((ptr), 0 /* rw==read */, 2 /* locality */)
#elif defined(__aarch64__)
#define PREFETCH_L1(ptr) \
do { \
__asm__ __volatile__("prfm pldl1keep, %0" ::"Q"(*(ptr))); \
} while (0)
#define PREFETCH_L2(ptr) \
do { \
__asm__ __volatile__("prfm pldl2keep, %0" ::"Q"(*(ptr))); \
} while (0)
#else
#define PREFETCH_L1(ptr) \
do { \
(void)(ptr); \
} while (0) /* disabled */
#define PREFETCH_L2(ptr) \
do { \
(void)(ptr); \
} while (0) /* disabled */
#endif
/* The SIMD matchers are only faster at certain quality levels. */
#if defined(_M_X64) && defined(BROTLI_TZCNT64)
#define BROTLI_MAX_SIMD_QUALITY 7
#elif defined(BROTLI_TZCNT64)
#define BROTLI_MAX_SIMD_QUALITY 6
#endif
}
#if defined(_MSC_VER)
#define BROTLI_CRASH() __debugbreak(), (void)abort()
#elif BROTLI_GNUC_HAS_BUILTIN(__builtin_trap, 3, 0, 0)
#define BROTLI_CRASH() (void)__builtin_trap()
#else
#define BROTLI_CRASH() (void)abort()
#endif
/* Make BROTLI_TEST=0 act same as undefined. */
#if defined(BROTLI_TEST) && ((1-BROTLI_TEST-1) == 0)
#undef BROTLI_TEST
#endif
#if BROTLI_GNUC_HAS_ATTRIBUTE(model, 3, 0, 3)
#define BROTLI_MODEL(M) __attribute__((model(M)))
#else
#define BROTLI_MODEL(M) /* M */
#endif
#if BROTLI_GNUC_HAS_ATTRIBUTE(cold, 4, 3, 0)
#define BROTLI_COLD __attribute__((cold))
#else
#define BROTLI_COLD /* cold */
#endif
#endif /* BROTLI_COMMON_PLATFORM_H_ */

View File

@@ -0,0 +1,517 @@
/* Copyright 2017 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Shared Dictionary definition and utilities. */
#include <brotli/shared_dictionary.h>
#include "dictionary.h"
#include "platform.h"
#include "shared_dictionary_internal.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#if defined(BROTLI_EXPERIMENTAL)
#define BROTLI_NUM_ENCODED_LENGTHS (SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH \
- SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH + 1)
/* Max allowed by spec */
#define BROTLI_MAX_SIZE_BITS 15u
/* Returns BROTLI_TRUE on success, BROTLI_FALSE on failure. */
static BROTLI_BOOL ReadBool(const uint8_t* encoded, size_t size, size_t* pos,
BROTLI_BOOL* result) {
uint8_t value;
size_t position = *pos;
if (position >= size) return BROTLI_FALSE; /* past file end */
value = encoded[position++];
if (value > 1) return BROTLI_FALSE; /* invalid bool */
*result = TO_BROTLI_BOOL(value);
*pos = position;
return BROTLI_TRUE; /* success */
}
/* Returns BROTLI_TRUE on success, BROTLI_FALSE on failure. */
static BROTLI_BOOL ReadUint8(const uint8_t* encoded, size_t size, size_t* pos,
uint8_t* result) {
size_t position = *pos;
if (position + sizeof(uint8_t) > size) return BROTLI_FALSE;
*result = encoded[position++];
*pos = position;
return BROTLI_TRUE;
}
/* Returns BROTLI_TRUE on success, BROTLI_FALSE on failure. */
static BROTLI_BOOL ReadUint16(const uint8_t* encoded, size_t size, size_t* pos,
uint16_t* result) {
size_t position = *pos;
if (position + sizeof(uint16_t) > size) return BROTLI_FALSE;
*result = BROTLI_UNALIGNED_LOAD16LE(&encoded[position]);
position += 2;
*pos = position;
return BROTLI_TRUE;
}
/* Reads a varint into a uint32_t, and returns error if it's too large */
/* Returns BROTLI_TRUE on success, BROTLI_FALSE on failure. */
static BROTLI_BOOL ReadVarint32(const uint8_t* encoded, size_t size,
size_t* pos, uint32_t* result) {
int num = 0;
uint8_t byte;
*result = 0;
for (;;) {
if (*pos >= size) return BROTLI_FALSE;
byte = encoded[(*pos)++];
if (num == 4 && byte > 15) return BROTLI_FALSE;
*result |= (uint32_t)(byte & 127) << (num * 7);
if (byte < 128) return BROTLI_TRUE;
num++;
}
}
/* Returns the total length of word list. */
static size_t BrotliSizeBitsToOffsets(const uint8_t* size_bits_by_length,
uint32_t* offsets_by_length) {
uint32_t pos = 0;
uint32_t i;
for (i = 0; i <= SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH; i++) {
offsets_by_length[i] = pos;
if (size_bits_by_length[i] != 0) {
pos += i << size_bits_by_length[i];
}
}
return pos;
}
static BROTLI_BOOL ParseWordList(size_t size, const uint8_t* encoded,
size_t* pos, BrotliDictionary* out) {
size_t offset;
size_t i;
size_t position = *pos;
if (position + BROTLI_NUM_ENCODED_LENGTHS > size) {
return BROTLI_FALSE;
}
memset(out->size_bits_by_length, 0, SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH);
memcpy(out->size_bits_by_length + SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH,
&encoded[position], BROTLI_NUM_ENCODED_LENGTHS);
for (i = SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH;
i <= SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH; i++) {
if (out->size_bits_by_length[i] > BROTLI_MAX_SIZE_BITS) {
return BROTLI_FALSE;
}
}
position += BROTLI_NUM_ENCODED_LENGTHS;
offset = BrotliSizeBitsToOffsets(
out->size_bits_by_length, out->offsets_by_length);
out->data = &encoded[position];
out->data_size = offset;
position += offset;
if (position > size) return BROTLI_FALSE;
*pos = position;
return BROTLI_TRUE;
}
/* Computes the cutOffTransforms of a BrotliTransforms which already has the
transforms data correctly filled in. */
static void ComputeCutoffTransforms(BrotliTransforms* transforms) {
uint32_t i;
for (i = 0; i < BROTLI_TRANSFORMS_MAX_CUT_OFF + 1; i++) {
transforms->cutOffTransforms[i] = -1;
}
for (i = 0; i < transforms->num_transforms; i++) {
const uint8_t* prefix = BROTLI_TRANSFORM_PREFIX(transforms, i);
uint8_t type = BROTLI_TRANSFORM_TYPE(transforms, i);
const uint8_t* suffix = BROTLI_TRANSFORM_SUFFIX(transforms, i);
if (type <= BROTLI_TRANSFORM_OMIT_LAST_9 && *prefix == 0 && *suffix == 0 &&
transforms->cutOffTransforms[type] == -1) {
transforms->cutOffTransforms[type] = (int16_t)i;
}
}
}
static BROTLI_BOOL ParsePrefixSuffixTable(size_t size, const uint8_t* encoded,
size_t* pos, BrotliTransforms* out, uint16_t* out_table,
size_t* out_table_size) {
size_t position = *pos;
size_t offset = 0;
size_t stringlet_count = 0; /* NUM_PREFIX_SUFFIX */
size_t data_length = 0;
/* PREFIX_SUFFIX_LENGTH */
if (!ReadUint16(encoded, size, &position, &out->prefix_suffix_size)) {
return BROTLI_FALSE;
}
data_length = out->prefix_suffix_size;
/* Must at least have space for null terminator. */
if (data_length < 1) return BROTLI_FALSE;
out->prefix_suffix = &encoded[position];
if (position + data_length >= size) return BROTLI_FALSE;
while (BROTLI_TRUE) {
/* STRING_LENGTH */
size_t stringlet_len = encoded[position + offset];
out_table[stringlet_count] = (uint16_t)offset;
stringlet_count++;
offset++;
if (stringlet_len == 0) {
if (offset == data_length) {
break;
} else {
return BROTLI_FALSE;
}
}
if (stringlet_count > 255) return BROTLI_FALSE;
offset += stringlet_len;
if (offset >= data_length) return BROTLI_FALSE;
}
position += data_length;
*pos = position;
*out_table_size = (uint16_t)stringlet_count;
return BROTLI_TRUE;
}
static BROTLI_BOOL ParseTransformsList(size_t size, const uint8_t* encoded,
size_t* pos, BrotliTransforms* out, uint16_t* prefix_suffix_table,
size_t* prefix_suffix_count) {
uint32_t i;
BROTLI_BOOL has_params = BROTLI_FALSE;
BROTLI_BOOL prefix_suffix_ok = BROTLI_FALSE;
size_t position = *pos;
size_t stringlet_cnt = 0;
if (position >= size) return BROTLI_FALSE;
prefix_suffix_ok = ParsePrefixSuffixTable(
size, encoded, &position, out, prefix_suffix_table, &stringlet_cnt);
if (!prefix_suffix_ok) return BROTLI_FALSE;
out->prefix_suffix_map = prefix_suffix_table;
*prefix_suffix_count = stringlet_cnt;
out->num_transforms = encoded[position++];
out->transforms = &encoded[position];
position += (size_t)out->num_transforms * 3;
if (position > size) return BROTLI_FALSE;
/* Check for errors and read extra parameters. */
for (i = 0; i < out->num_transforms; i++) {
uint8_t prefix_id = BROTLI_TRANSFORM_PREFIX_ID(out, i);
uint8_t type = BROTLI_TRANSFORM_TYPE(out, i);
uint8_t suffix_id = BROTLI_TRANSFORM_SUFFIX_ID(out, i);
if (prefix_id >= stringlet_cnt) return BROTLI_FALSE;
if (type >= BROTLI_NUM_TRANSFORM_TYPES) return BROTLI_FALSE;
if (suffix_id >= stringlet_cnt) return BROTLI_FALSE;
if (type == BROTLI_TRANSFORM_SHIFT_FIRST ||
type == BROTLI_TRANSFORM_SHIFT_ALL) {
has_params = BROTLI_TRUE;
}
}
if (has_params) {
out->params = &encoded[position];
position += (size_t)out->num_transforms * 2;
if (position > size) return BROTLI_FALSE;
for (i = 0; i < out->num_transforms; i++) {
uint8_t type = BROTLI_TRANSFORM_TYPE(out, i);
if (type != BROTLI_TRANSFORM_SHIFT_FIRST &&
type != BROTLI_TRANSFORM_SHIFT_ALL) {
if (out->params[i * 2] != 0 || out->params[i * 2 + 1] != 0) {
return BROTLI_FALSE;
}
}
}
} else {
out->params = NULL;
}
ComputeCutoffTransforms(out);
*pos = position;
return BROTLI_TRUE;
}
static BROTLI_BOOL DryParseDictionary(const uint8_t* encoded,
size_t size, uint32_t* num_prefix, BROTLI_BOOL* is_custom_static_dict) {
size_t pos = 0;
uint32_t chunk_size = 0;
uint8_t num_word_lists;
uint8_t num_transform_lists;
*is_custom_static_dict = BROTLI_FALSE;
*num_prefix = 0;
/* Skip magic header bytes. */
pos += 2;
/* LZ77_DICTIONARY_LENGTH */
if (!ReadVarint32(encoded, size, &pos, &chunk_size)) return BROTLI_FALSE;
if (chunk_size != 0) {
/* This limitation is not specified but the 32-bit Brotli decoder for now */
if (chunk_size > 1073741823) return BROTLI_FALSE;
*num_prefix = 1;
if (pos + chunk_size > size) return BROTLI_FALSE;
pos += chunk_size;
}
if (!ReadUint8(encoded, size, &pos, &num_word_lists)) {
return BROTLI_FALSE;
}
if (!ReadUint8(encoded, size, &pos, &num_transform_lists)) {
return BROTLI_FALSE;
}
if (num_word_lists > 0 || num_transform_lists > 0) {
*is_custom_static_dict = BROTLI_TRUE;
}
return BROTLI_TRUE;
}
static BROTLI_BOOL ParseDictionary(const uint8_t* encoded, size_t size,
BrotliSharedDictionary* dict) {
uint32_t i;
size_t pos = 0;
uint32_t chunk_size = 0;
size_t total_prefix_suffix_count = 0;
size_t transform_list_start[SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS];
uint16_t temporary_prefix_suffix_table[256];
/* Skip magic header bytes. */
pos += 2;
/* LZ77_DICTIONARY_LENGTH */
if (!ReadVarint32(encoded, size, &pos, &chunk_size)) return BROTLI_FALSE;
if (chunk_size != 0) {
if (pos + chunk_size > size) return BROTLI_FALSE;
dict->prefix_size[dict->num_prefix] = chunk_size;
dict->prefix[dict->num_prefix] = &encoded[pos];
dict->num_prefix++;
/* LZ77_DICTIONARY_LENGTH bytes. */
pos += chunk_size;
}
/* NUM_WORD_LISTS */
if (!ReadUint8(encoded, size, &pos, &dict->num_word_lists)) {
return BROTLI_FALSE;
}
if (dict->num_word_lists > SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS) {
return BROTLI_FALSE;
}
if (dict->num_word_lists != 0) {
dict->words_instances = (BrotliDictionary*)dict->alloc_func(
dict->memory_manager_opaque,
dict->num_word_lists * sizeof(*dict->words_instances));
if (!dict->words_instances) return BROTLI_FALSE; /* OOM */
}
for (i = 0; i < dict->num_word_lists; i++) {
if (!ParseWordList(size, encoded, &pos, &dict->words_instances[i])) {
return BROTLI_FALSE;
}
}
/* NUM_TRANSFORM_LISTS */
if (!ReadUint8(encoded, size, &pos, &dict->num_transform_lists)) {
return BROTLI_FALSE;
}
if (dict->num_transform_lists > SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS) {
return BROTLI_FALSE;
}
if (dict->num_transform_lists != 0) {
dict->transforms_instances = (BrotliTransforms*)dict->alloc_func(
dict->memory_manager_opaque,
dict->num_transform_lists * sizeof(*dict->transforms_instances));
if (!dict->transforms_instances) return BROTLI_FALSE; /* OOM */
}
for (i = 0; i < dict->num_transform_lists; i++) {
BROTLI_BOOL ok = BROTLI_FALSE;
size_t prefix_suffix_count = 0;
transform_list_start[i] = pos;
dict->transforms_instances[i].prefix_suffix_map =
temporary_prefix_suffix_table;
ok = ParseTransformsList(
size, encoded, &pos, &dict->transforms_instances[i],
temporary_prefix_suffix_table, &prefix_suffix_count);
if (!ok) return BROTLI_FALSE;
total_prefix_suffix_count += prefix_suffix_count;
}
if (total_prefix_suffix_count != 0) {
dict->prefix_suffix_maps = (uint16_t*)dict->alloc_func(
dict->memory_manager_opaque,
total_prefix_suffix_count * sizeof(*dict->prefix_suffix_maps));
if (!dict->prefix_suffix_maps) return BROTLI_FALSE; /* OOM */
}
total_prefix_suffix_count = 0;
for (i = 0; i < dict->num_transform_lists; i++) {
size_t prefix_suffix_count = 0;
size_t position = transform_list_start[i];
uint16_t* prefix_suffix_map =
&dict->prefix_suffix_maps[total_prefix_suffix_count];
BROTLI_BOOL ok = ParsePrefixSuffixTable(
size, encoded, &position, &dict->transforms_instances[i],
prefix_suffix_map, &prefix_suffix_count);
if (!ok) return BROTLI_FALSE;
dict->transforms_instances[i].prefix_suffix_map = prefix_suffix_map;
total_prefix_suffix_count += prefix_suffix_count;
}
if (dict->num_word_lists != 0 || dict->num_transform_lists != 0) {
if (!ReadUint8(encoded, size, &pos, &dict->num_dictionaries)) {
return BROTLI_FALSE;
}
if (dict->num_dictionaries == 0 ||
dict->num_dictionaries > SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS) {
return BROTLI_FALSE;
}
for (i = 0; i < dict->num_dictionaries; i++) {
uint8_t words_index;
uint8_t transforms_index;
if (!ReadUint8(encoded, size, &pos, &words_index)) {
return BROTLI_FALSE;
}
if (words_index > dict->num_word_lists) return BROTLI_FALSE;
if (!ReadUint8(encoded, size, &pos, &transforms_index)) {
return BROTLI_FALSE;
}
if (transforms_index > dict->num_transform_lists) return BROTLI_FALSE;
dict->words[i] = words_index == dict->num_word_lists ?
BrotliGetDictionary() : &dict->words_instances[words_index];
dict->transforms[i] = transforms_index == dict->num_transform_lists ?
BrotliGetTransforms(): &dict->transforms_instances[transforms_index];
}
/* CONTEXT_ENABLED */
if (!ReadBool(encoded, size, &pos, &dict->context_based)) {
return BROTLI_FALSE;
}
/* CONTEXT_MAP */
if (dict->context_based) {
for (i = 0; i < SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS; i++) {
if (!ReadUint8(encoded, size, &pos, &dict->context_map[i])) {
return BROTLI_FALSE;
}
if (dict->context_map[i] >= dict->num_dictionaries) {
return BROTLI_FALSE;
}
}
}
} else {
dict->context_based = BROTLI_FALSE;
dict->num_dictionaries = 1;
dict->words[0] = BrotliGetDictionary();
dict->transforms[0] = BrotliGetTransforms();
}
return BROTLI_TRUE;
}
/* Decodes shared dictionary and verifies correctness.
Returns BROTLI_TRUE if dictionary is valid, BROTLI_FALSE otherwise.
The BrotliSharedDictionary must already have been initialized. If the
BrotliSharedDictionary already contains data, compound dictionaries
will be appended, but an error will be returned if it already has
custom words or transforms.
TODO(lode): link to RFC for shared brotli once published. */
static BROTLI_BOOL DecodeSharedDictionary(
const uint8_t* encoded, size_t size, BrotliSharedDictionary* dict) {
uint32_t num_prefix = 0;
BROTLI_BOOL is_custom_static_dict = BROTLI_FALSE;
BROTLI_BOOL has_custom_static_dict =
dict->num_word_lists > 0 || dict->num_transform_lists > 0;
/* Check magic header bytes. */
if (size < 2) return BROTLI_FALSE;
if (encoded[0] != 0x91 || encoded[1] != 0) return BROTLI_FALSE;
if (!DryParseDictionary(encoded, size, &num_prefix, &is_custom_static_dict)) {
return BROTLI_FALSE;
}
if (num_prefix + dict->num_prefix > SHARED_BROTLI_MAX_COMPOUND_DICTS) {
return BROTLI_FALSE;
}
/* Cannot combine different static dictionaries, only prefix dictionaries */
if (has_custom_static_dict && is_custom_static_dict) return BROTLI_FALSE;
return ParseDictionary(encoded, size, dict);
}
#endif /* BROTLI_EXPERIMENTAL */
void BrotliSharedDictionaryDestroyInstance(
BrotliSharedDictionary* dict) {
if (!dict) {
return;
} else {
brotli_free_func free_func = dict->free_func;
void* opaque = dict->memory_manager_opaque;
/* Cleanup. */
free_func(opaque, dict->words_instances);
free_func(opaque, dict->transforms_instances);
free_func(opaque, dict->prefix_suffix_maps);
/* Self-destruction. */
free_func(opaque, dict);
}
}
BROTLI_BOOL BrotliSharedDictionaryAttach(
BrotliSharedDictionary* dict, BrotliSharedDictionaryType type,
size_t data_size, const uint8_t data[BROTLI_ARRAY_PARAM(data_size)]) {
if (!dict) {
return BROTLI_FALSE;
}
#if defined(BROTLI_EXPERIMENTAL)
if (type == BROTLI_SHARED_DICTIONARY_SERIALIZED) {
return DecodeSharedDictionary(data, data_size, dict);
}
#endif /* BROTLI_EXPERIMENTAL */
if (type == BROTLI_SHARED_DICTIONARY_RAW) {
if (dict->num_prefix >= SHARED_BROTLI_MAX_COMPOUND_DICTS) {
return BROTLI_FALSE;
}
dict->prefix_size[dict->num_prefix] = data_size;
dict->prefix[dict->num_prefix] = data;
dict->num_prefix++;
return BROTLI_TRUE;
}
return BROTLI_FALSE;
}
BrotliSharedDictionary* BrotliSharedDictionaryCreateInstance(
brotli_alloc_func alloc_func, brotli_free_func free_func, void* opaque) {
BrotliSharedDictionary* dict = 0;
if (!alloc_func && !free_func) {
dict = (BrotliSharedDictionary*)malloc(sizeof(BrotliSharedDictionary));
} else if (alloc_func && free_func) {
dict = (BrotliSharedDictionary*)alloc_func(
opaque, sizeof(BrotliSharedDictionary));
}
if (dict == 0) {
return 0;
}
/* TODO(eustas): explicitly initialize all the fields? */
memset(dict, 0, sizeof(BrotliSharedDictionary));
dict->context_based = BROTLI_FALSE;
dict->num_dictionaries = 1;
dict->num_word_lists = 0;
dict->num_transform_lists = 0;
dict->words[0] = BrotliGetDictionary();
dict->transforms[0] = BrotliGetTransforms();
dict->alloc_func = alloc_func ? alloc_func : BrotliDefaultAllocFunc;
dict->free_func = free_func ? free_func : BrotliDefaultFreeFunc;
dict->memory_manager_opaque = opaque;
return dict;
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

View File

@@ -0,0 +1,75 @@
/* Copyright 2017 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* (Transparent) Shared Dictionary definition. */
#ifndef BROTLI_COMMON_SHARED_DICTIONARY_INTERNAL_H_
#define BROTLI_COMMON_SHARED_DICTIONARY_INTERNAL_H_
#include <brotli/shared_dictionary.h>
#include "dictionary.h"
#include "platform.h"
#include "transform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
struct BrotliSharedDictionaryStruct {
/* LZ77 prefixes (compound dictionary). */
uint32_t num_prefix; /* max SHARED_BROTLI_MAX_COMPOUND_DICTS */
size_t prefix_size[SHARED_BROTLI_MAX_COMPOUND_DICTS];
const uint8_t* prefix[SHARED_BROTLI_MAX_COMPOUND_DICTS];
/* If set, the context map is used to select word and transform list from 64
contexts, if not set, the context map is not used and only words[0] and
transforms[0] are to be used. */
BROTLI_BOOL context_based;
uint8_t context_map[SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS];
/* Amount of word_list+transform_list combinations. */
uint8_t num_dictionaries;
/* Must use num_dictionaries values. */
const BrotliDictionary* words[SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS];
/* Must use num_dictionaries values. */
const BrotliTransforms* transforms[SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS];
/* Amount of custom word lists. May be 0 if only Brotli's built-in is used */
uint8_t num_word_lists;
/* Contents of the custom words lists. Must be NULL if num_word_lists is 0. */
BrotliDictionary* words_instances;
/* Amount of custom transform lists. May be 0 if only Brotli's built-in is
used */
uint8_t num_transform_lists;
/* Contents of the custom transform lists. Must be NULL if num_transform_lists
is 0. */
BrotliTransforms* transforms_instances;
/* Concatenated prefix_suffix_maps of the custom transform lists. Must be NULL
if num_transform_lists is 0. */
uint16_t* prefix_suffix_maps;
/* Memory management */
brotli_alloc_func alloc_func;
brotli_free_func free_func;
void* memory_manager_opaque;
};
typedef struct BrotliSharedDictionaryStruct BrotliSharedDictionaryInternal;
#define BrotliSharedDictionary BrotliSharedDictionaryInternal
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_COMMON_SHARED_DICTIONARY_INTERNAL_H_ */

56
c/common/static_init.h Normal file
View File

@@ -0,0 +1,56 @@
/* Copyright 2025 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/*
Central point for static initialization.
In case of "lazy" mode `BrotliXxxLazyStaticInit` is not provided by the
library. Embedder is responsible for providing it. This function should call
`BrotliXxxLazyStaticInitInner` on the first invocation. This function should
not return until execution of `BrotliXxxLazyStaticInitInner` is finished.
In C or before C++11 it is possible to call `BrotliXxxLazyStaticInitInner`
on start-up path and then `BrotliEncoderLazyStaticInit` is could be no-op;
another option is to use available thread execution controls to meet the
requirements. For possible C++11 implementation see static_init_lazy.cc.
*/
#ifndef THIRD_PARTY_BROTLI_COMMON_STATIC_INIT_H_
#define THIRD_PARTY_BROTLI_COMMON_STATIC_INIT_H_
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* Static data is "initialized" in compile time. */
#define BROTLI_STATIC_INIT_NONE 0
/* Static data is initialized before "main". */
#define BROTLI_STATIC_INIT_EARLY 1
/* Static data is initialized when first encoder is created. */
#define BROTLI_STATIC_INIT_LAZY 2
#define BROTLI_STATIC_INIT_DEFAULT BROTLI_STATIC_INIT_NONE
#if !defined(BROTLI_STATIC_INIT)
#define BROTLI_STATIC_INIT BROTLI_STATIC_INIT_DEFAULT
#endif
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE) && \
(BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_EARLY) && \
(BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_LAZY)
#error Invalid value for BROTLI_STATIC_INIT
#endif
#if (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_EARLY)
#if defined(BROTLI_EXTERNAL_DICTIONARY_DATA)
#error BROTLI_STATIC_INIT_EARLY will fail with BROTLI_EXTERNAL_DICTIONARY_DATA
#endif
#endif /* BROTLI_STATIC_INIT */
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif // THIRD_PARTY_BROTLI_COMMON_STATIC_INIT_H_

293
c/common/transform.c Normal file
View File

@@ -0,0 +1,293 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "platform.h"
#include "transform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* RFC 7932 transforms string data */
static const BROTLI_MODEL("small") char kPrefixSuffix[217] =
"\1 \2, \10 of the \4 of \2s \1.\5 and \4 "
/* 0x _0 _2 __5 _E _3 _6 _8 _E */
"in \1\"\4 to \2\">\1\n\2. \1]\5 for \3 a \6 "
/* 2x _3_ _5 _A_ _D_ _F _2 _4 _A _E */
"that \1\'\6 with \6 from \4 by \1(\6. T"
/* 4x _5_ _7 _E _5 _A _C */
"he \4 on \4 as \4 is \4ing \2\n\t\1:\3ed "
/* 6x _3 _8 _D _2 _7_ _ _A _C */
"\2=\"\4 at \3ly \1,\2=\'\5.com/\7. This \5"
/* 8x _0 _ _3 _8 _C _E _ _1 _7 _F */
" not \3er \3al \4ful \4ive \5less \4es"
/* Ax _5 _9 _D _2 _7 _D */
"t \4ize \2\xc2\xa0\4ous \5 the \2e "; /* \0 - implicit trailing zero. */
/* Cx _2 _7___ ___ _A _F _5 _8 */
static const BROTLI_MODEL("small") uint16_t kPrefixSuffixMap[50] = {
0x00, 0x02, 0x05, 0x0E, 0x13, 0x16, 0x18, 0x1E, 0x23, 0x25,
0x2A, 0x2D, 0x2F, 0x32, 0x34, 0x3A, 0x3E, 0x45, 0x47, 0x4E,
0x55, 0x5A, 0x5C, 0x63, 0x68, 0x6D, 0x72, 0x77, 0x7A, 0x7C,
0x80, 0x83, 0x88, 0x8C, 0x8E, 0x91, 0x97, 0x9F, 0xA5, 0xA9,
0xAD, 0xB2, 0xB7, 0xBD, 0xC2, 0xC7, 0xCA, 0xCF, 0xD5, 0xD8
};
/* RFC 7932 transforms */
static const BROTLI_MODEL("small") uint8_t kTransformsData[] = {
49, BROTLI_TRANSFORM_IDENTITY, 49,
49, BROTLI_TRANSFORM_IDENTITY, 0,
0, BROTLI_TRANSFORM_IDENTITY, 0,
49, BROTLI_TRANSFORM_OMIT_FIRST_1, 49,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 0,
49, BROTLI_TRANSFORM_IDENTITY, 47,
0, BROTLI_TRANSFORM_IDENTITY, 49,
4, BROTLI_TRANSFORM_IDENTITY, 0,
49, BROTLI_TRANSFORM_IDENTITY, 3,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 49,
49, BROTLI_TRANSFORM_IDENTITY, 6,
49, BROTLI_TRANSFORM_OMIT_FIRST_2, 49,
49, BROTLI_TRANSFORM_OMIT_LAST_1, 49,
1, BROTLI_TRANSFORM_IDENTITY, 0,
49, BROTLI_TRANSFORM_IDENTITY, 1,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 0,
49, BROTLI_TRANSFORM_IDENTITY, 7,
49, BROTLI_TRANSFORM_IDENTITY, 9,
48, BROTLI_TRANSFORM_IDENTITY, 0,
49, BROTLI_TRANSFORM_IDENTITY, 8,
49, BROTLI_TRANSFORM_IDENTITY, 5,
49, BROTLI_TRANSFORM_IDENTITY, 10,
49, BROTLI_TRANSFORM_IDENTITY, 11,
49, BROTLI_TRANSFORM_OMIT_LAST_3, 49,
49, BROTLI_TRANSFORM_IDENTITY, 13,
49, BROTLI_TRANSFORM_IDENTITY, 14,
49, BROTLI_TRANSFORM_OMIT_FIRST_3, 49,
49, BROTLI_TRANSFORM_OMIT_LAST_2, 49,
49, BROTLI_TRANSFORM_IDENTITY, 15,
49, BROTLI_TRANSFORM_IDENTITY, 16,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 49,
49, BROTLI_TRANSFORM_IDENTITY, 12,
5, BROTLI_TRANSFORM_IDENTITY, 49,
0, BROTLI_TRANSFORM_IDENTITY, 1,
49, BROTLI_TRANSFORM_OMIT_FIRST_4, 49,
49, BROTLI_TRANSFORM_IDENTITY, 18,
49, BROTLI_TRANSFORM_IDENTITY, 17,
49, BROTLI_TRANSFORM_IDENTITY, 19,
49, BROTLI_TRANSFORM_IDENTITY, 20,
49, BROTLI_TRANSFORM_OMIT_FIRST_5, 49,
49, BROTLI_TRANSFORM_OMIT_FIRST_6, 49,
47, BROTLI_TRANSFORM_IDENTITY, 49,
49, BROTLI_TRANSFORM_OMIT_LAST_4, 49,
49, BROTLI_TRANSFORM_IDENTITY, 22,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 49,
49, BROTLI_TRANSFORM_IDENTITY, 23,
49, BROTLI_TRANSFORM_IDENTITY, 24,
49, BROTLI_TRANSFORM_IDENTITY, 25,
49, BROTLI_TRANSFORM_OMIT_LAST_7, 49,
49, BROTLI_TRANSFORM_OMIT_LAST_1, 26,
49, BROTLI_TRANSFORM_IDENTITY, 27,
49, BROTLI_TRANSFORM_IDENTITY, 28,
0, BROTLI_TRANSFORM_IDENTITY, 12,
49, BROTLI_TRANSFORM_IDENTITY, 29,
49, BROTLI_TRANSFORM_OMIT_FIRST_9, 49,
49, BROTLI_TRANSFORM_OMIT_FIRST_7, 49,
49, BROTLI_TRANSFORM_OMIT_LAST_6, 49,
49, BROTLI_TRANSFORM_IDENTITY, 21,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 1,
49, BROTLI_TRANSFORM_OMIT_LAST_8, 49,
49, BROTLI_TRANSFORM_IDENTITY, 31,
49, BROTLI_TRANSFORM_IDENTITY, 32,
47, BROTLI_TRANSFORM_IDENTITY, 3,
49, BROTLI_TRANSFORM_OMIT_LAST_5, 49,
49, BROTLI_TRANSFORM_OMIT_LAST_9, 49,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 1,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 8,
5, BROTLI_TRANSFORM_IDENTITY, 21,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 0,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 10,
49, BROTLI_TRANSFORM_IDENTITY, 30,
0, BROTLI_TRANSFORM_IDENTITY, 5,
35, BROTLI_TRANSFORM_IDENTITY, 49,
47, BROTLI_TRANSFORM_IDENTITY, 2,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 17,
49, BROTLI_TRANSFORM_IDENTITY, 36,
49, BROTLI_TRANSFORM_IDENTITY, 33,
5, BROTLI_TRANSFORM_IDENTITY, 0,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 21,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 5,
49, BROTLI_TRANSFORM_IDENTITY, 37,
0, BROTLI_TRANSFORM_IDENTITY, 30,
49, BROTLI_TRANSFORM_IDENTITY, 38,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 0,
49, BROTLI_TRANSFORM_IDENTITY, 39,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 49,
49, BROTLI_TRANSFORM_IDENTITY, 34,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 8,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 12,
0, BROTLI_TRANSFORM_IDENTITY, 21,
49, BROTLI_TRANSFORM_IDENTITY, 40,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 12,
49, BROTLI_TRANSFORM_IDENTITY, 41,
49, BROTLI_TRANSFORM_IDENTITY, 42,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 17,
49, BROTLI_TRANSFORM_IDENTITY, 43,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 5,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 10,
0, BROTLI_TRANSFORM_IDENTITY, 34,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 33,
49, BROTLI_TRANSFORM_IDENTITY, 44,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 5,
45, BROTLI_TRANSFORM_IDENTITY, 49,
0, BROTLI_TRANSFORM_IDENTITY, 33,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 30,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 30,
49, BROTLI_TRANSFORM_IDENTITY, 46,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 1,
49, BROTLI_TRANSFORM_UPPERCASE_FIRST, 34,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 33,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 30,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 1,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 33,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 21,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 12,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 5,
49, BROTLI_TRANSFORM_UPPERCASE_ALL, 34,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 12,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 30,
0, BROTLI_TRANSFORM_UPPERCASE_ALL, 34,
0, BROTLI_TRANSFORM_UPPERCASE_FIRST, 34,
};
static const BROTLI_MODEL("small")
BrotliTransforms kBrotliTransforms = {
sizeof(kPrefixSuffix),
(const uint8_t*)kPrefixSuffix,
kPrefixSuffixMap,
sizeof(kTransformsData) / (3 * sizeof(kTransformsData[0])),
kTransformsData,
NULL, /* no extra parameters */
{0, 12, 27, 23, 42, 63, 56, 48, 59, 64}
};
const BrotliTransforms* BrotliGetTransforms(void) {
return &kBrotliTransforms;
}
static int ToUpperCase(uint8_t* p) {
if (p[0] < 0xC0) {
if (p[0] >= 'a' && p[0] <= 'z') {
p[0] ^= 32;
}
return 1;
}
/* An overly simplified uppercasing model for UTF-8. */
if (p[0] < 0xE0) {
p[1] ^= 32;
return 2;
}
/* An arbitrary transform for three byte characters. */
p[2] ^= 5;
return 3;
}
static int Shift(uint8_t* word, int word_len, uint16_t parameter) {
/* Limited sign extension: scalar < (1 << 24). */
uint32_t scalar =
(parameter & 0x7FFFu) + (0x1000000u - (parameter & 0x8000u));
if (word[0] < 0x80) {
/* 1-byte rune / 0sssssss / 7 bit scalar (ASCII). */
scalar += (uint32_t)word[0];
word[0] = (uint8_t)(scalar & 0x7Fu);
return 1;
} else if (word[0] < 0xC0) {
/* Continuation / 10AAAAAA. */
return 1;
} else if (word[0] < 0xE0) {
/* 2-byte rune / 110sssss AAssssss / 11 bit scalar. */
if (word_len < 2) return 1;
scalar += (uint32_t)((word[1] & 0x3Fu) | ((word[0] & 0x1Fu) << 6u));
word[0] = (uint8_t)(0xC0 | ((scalar >> 6u) & 0x1F));
word[1] = (uint8_t)((word[1] & 0xC0) | (scalar & 0x3F));
return 2;
} else if (word[0] < 0xF0) {
/* 3-byte rune / 1110ssss AAssssss BBssssss / 16 bit scalar. */
if (word_len < 3) return word_len;
scalar += (uint32_t)((word[2] & 0x3Fu) | ((word[1] & 0x3Fu) << 6u) |
((word[0] & 0x0Fu) << 12u));
word[0] = (uint8_t)(0xE0 | ((scalar >> 12u) & 0x0F));
word[1] = (uint8_t)((word[1] & 0xC0) | ((scalar >> 6u) & 0x3F));
word[2] = (uint8_t)((word[2] & 0xC0) | (scalar & 0x3F));
return 3;
} else if (word[0] < 0xF8) {
/* 4-byte rune / 11110sss AAssssss BBssssss CCssssss / 21 bit scalar. */
if (word_len < 4) return word_len;
scalar += (uint32_t)((word[3] & 0x3Fu) | ((word[2] & 0x3Fu) << 6u) |
((word[1] & 0x3Fu) << 12u) | ((word[0] & 0x07u) << 18u));
word[0] = (uint8_t)(0xF0 | ((scalar >> 18u) & 0x07));
word[1] = (uint8_t)((word[1] & 0xC0) | ((scalar >> 12u) & 0x3F));
word[2] = (uint8_t)((word[2] & 0xC0) | ((scalar >> 6u) & 0x3F));
word[3] = (uint8_t)((word[3] & 0xC0) | (scalar & 0x3F));
return 4;
}
return 1;
}
int BrotliTransformDictionaryWord(uint8_t* dst, const uint8_t* word, int len,
const BrotliTransforms* transforms, int transform_idx) {
int idx = 0;
const uint8_t* prefix = BROTLI_TRANSFORM_PREFIX(transforms, transform_idx);
uint8_t type = BROTLI_TRANSFORM_TYPE(transforms, transform_idx);
const uint8_t* suffix = BROTLI_TRANSFORM_SUFFIX(transforms, transform_idx);
{
int prefix_len = *prefix++;
while (prefix_len--) { dst[idx++] = *prefix++; }
}
{
const int t = type;
int i = 0;
if (t <= BROTLI_TRANSFORM_OMIT_LAST_9) {
len -= t;
} else if (t >= BROTLI_TRANSFORM_OMIT_FIRST_1
&& t <= BROTLI_TRANSFORM_OMIT_FIRST_9) {
int skip = t - (BROTLI_TRANSFORM_OMIT_FIRST_1 - 1);
word += skip;
len -= skip;
}
while (i < len) { dst[idx++] = word[i++]; }
if (t == BROTLI_TRANSFORM_UPPERCASE_FIRST) {
ToUpperCase(&dst[idx - len]);
} else if (t == BROTLI_TRANSFORM_UPPERCASE_ALL) {
uint8_t* uppercase = &dst[idx - len];
while (len > 0) {
int step = ToUpperCase(uppercase);
uppercase += step;
len -= step;
}
} else if (t == BROTLI_TRANSFORM_SHIFT_FIRST) {
uint16_t param = (uint16_t)(transforms->params[transform_idx * 2]
+ (transforms->params[transform_idx * 2 + 1] << 8u));
Shift(&dst[idx - len], len, param);
} else if (t == BROTLI_TRANSFORM_SHIFT_ALL) {
uint16_t param = (uint16_t)(transforms->params[transform_idx * 2]
+ (transforms->params[transform_idx * 2 + 1] << 8u));
uint8_t* shift = &dst[idx - len];
while (len > 0) {
int step = Shift(shift, len, param);
shift += step;
len -= step;
}
}
}
{
int suffix_len = *suffix++;
while (suffix_len--) { dst[idx++] = *suffix++; }
return idx;
}
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

84
c/common/transform.h Normal file
View File

@@ -0,0 +1,84 @@
/* transforms is a part of ABI, but not API.
It means that there are some functions that are supposed to be in "common"
library, but header itself is not placed into include/brotli. This way,
aforementioned functions will be available only to brotli internals.
*/
#ifndef BROTLI_COMMON_TRANSFORM_H_
#define BROTLI_COMMON_TRANSFORM_H_
#include "platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
enum BrotliWordTransformType {
BROTLI_TRANSFORM_IDENTITY = 0,
BROTLI_TRANSFORM_OMIT_LAST_1 = 1,
BROTLI_TRANSFORM_OMIT_LAST_2 = 2,
BROTLI_TRANSFORM_OMIT_LAST_3 = 3,
BROTLI_TRANSFORM_OMIT_LAST_4 = 4,
BROTLI_TRANSFORM_OMIT_LAST_5 = 5,
BROTLI_TRANSFORM_OMIT_LAST_6 = 6,
BROTLI_TRANSFORM_OMIT_LAST_7 = 7,
BROTLI_TRANSFORM_OMIT_LAST_8 = 8,
BROTLI_TRANSFORM_OMIT_LAST_9 = 9,
BROTLI_TRANSFORM_UPPERCASE_FIRST = 10,
BROTLI_TRANSFORM_UPPERCASE_ALL = 11,
BROTLI_TRANSFORM_OMIT_FIRST_1 = 12,
BROTLI_TRANSFORM_OMIT_FIRST_2 = 13,
BROTLI_TRANSFORM_OMIT_FIRST_3 = 14,
BROTLI_TRANSFORM_OMIT_FIRST_4 = 15,
BROTLI_TRANSFORM_OMIT_FIRST_5 = 16,
BROTLI_TRANSFORM_OMIT_FIRST_6 = 17,
BROTLI_TRANSFORM_OMIT_FIRST_7 = 18,
BROTLI_TRANSFORM_OMIT_FIRST_8 = 19,
BROTLI_TRANSFORM_OMIT_FIRST_9 = 20,
BROTLI_TRANSFORM_SHIFT_FIRST = 21,
BROTLI_TRANSFORM_SHIFT_ALL = 22,
BROTLI_NUM_TRANSFORM_TYPES /* Counts transforms, not a transform itself. */
};
#define BROTLI_TRANSFORMS_MAX_CUT_OFF BROTLI_TRANSFORM_OMIT_LAST_9
typedef struct BrotliTransforms {
uint16_t prefix_suffix_size;
/* Last character must be null, so prefix_suffix_size must be at least 1. */
const uint8_t* prefix_suffix;
const uint16_t* prefix_suffix_map;
uint32_t num_transforms;
/* Each entry is a [prefix_id, transform, suffix_id] triplet. */
const uint8_t* transforms;
/* Shift for BROTLI_TRANSFORM_SHIFT_FIRST and BROTLI_TRANSFORM_SHIFT_ALL,
must be NULL if and only if no such transforms are present. */
const uint8_t* params;
/* Indices of transforms like ["", BROTLI_TRANSFORM_OMIT_LAST_#, ""].
0-th element corresponds to ["", BROTLI_TRANSFORM_IDENTITY, ""].
-1, if cut-off transform does not exist. */
int16_t cutOffTransforms[BROTLI_TRANSFORMS_MAX_CUT_OFF + 1];
} BrotliTransforms;
/* T is BrotliTransforms*; result is uint8_t. */
#define BROTLI_TRANSFORM_PREFIX_ID(T, I) ((T)->transforms[((I) * 3) + 0])
#define BROTLI_TRANSFORM_TYPE(T, I) ((T)->transforms[((I) * 3) + 1])
#define BROTLI_TRANSFORM_SUFFIX_ID(T, I) ((T)->transforms[((I) * 3) + 2])
/* T is BrotliTransforms*; result is const uint8_t*. */
#define BROTLI_TRANSFORM_PREFIX(T, I) (&(T)->prefix_suffix[ \
(T)->prefix_suffix_map[BROTLI_TRANSFORM_PREFIX_ID(T, I)]])
#define BROTLI_TRANSFORM_SUFFIX(T, I) (&(T)->prefix_suffix[ \
(T)->prefix_suffix_map[BROTLI_TRANSFORM_SUFFIX_ID(T, I)]])
BROTLI_COMMON_API const BrotliTransforms* BrotliGetTransforms(void);
BROTLI_COMMON_API int BrotliTransformDictionaryWord(
uint8_t* dst, const uint8_t* word, int len,
const BrotliTransforms* transforms, int transform_idx);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_COMMON_TRANSFORM_H_ */

51
c/common/version.h Normal file
View File

@@ -0,0 +1,51 @@
/* Copyright 2016 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Version definition. */
#ifndef BROTLI_COMMON_VERSION_H_
#define BROTLI_COMMON_VERSION_H_
/* Compose 3 components into a single number. In a hexadecimal representation
B and C components occupy exactly 3 digits. */
#define BROTLI_MAKE_HEX_VERSION(A, B, C) ((A << 24) | (B << 12) | C)
/* Those macros should only be used when library is compiled together with
the client. If library is dynamically linked, use BrotliDecoderVersion and
BrotliEncoderVersion methods. */
#define BROTLI_VERSION_MAJOR 1
#define BROTLI_VERSION_MINOR 2
#define BROTLI_VERSION_PATCH 0
#define BROTLI_VERSION BROTLI_MAKE_HEX_VERSION( \
BROTLI_VERSION_MAJOR, BROTLI_VERSION_MINOR, BROTLI_VERSION_PATCH)
/* This macro is used by build system to produce Libtool-friendly soname. See
https://www.gnu.org/software/libtool/manual/html_node/Libtool-versioning.html
Version evolution rules:
- interfaces added (or change is compatible) -> current+1:0:age+1
- interfaces removed (or changed is incompatible) -> current+1:0:0
- interfaces not changed -> current:revision+1:age
*/
#define BROTLI_ABI_CURRENT 3
#define BROTLI_ABI_REVISION 0
#define BROTLI_ABI_AGE 2
#if BROTLI_VERSION_MAJOR != (BROTLI_ABI_CURRENT - BROTLI_ABI_AGE)
#error ABI/API version inconsistency
#endif
#if BROTLI_VERSION_MINOR != BROTLI_ABI_AGE
#error ABI/API version inconsistency
#endif
#if BROTLI_VERSION_PATCH != BROTLI_ABI_REVISION
#error ABI/API version inconsistency
#endif
#endif /* BROTLI_COMMON_VERSION_H_ */

77
c/dec/bit_reader.c Normal file
View File

@@ -0,0 +1,77 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Bit reading helpers */
#include "bit_reader.h"
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
const BROTLI_MODEL("small")
brotli_reg_t kBrotliBitMask[33] = { 0x00000000,
0x00000001, 0x00000003, 0x00000007, 0x0000000F,
0x0000001F, 0x0000003F, 0x0000007F, 0x000000FF,
0x000001FF, 0x000003FF, 0x000007FF, 0x00000FFF,
0x00001FFF, 0x00003FFF, 0x00007FFF, 0x0000FFFF,
0x0001FFFF, 0x0003FFFF, 0x0007FFFF, 0x000FFFFF,
0x001FFFFF, 0x003FFFFF, 0x007FFFFF, 0x00FFFFFF,
0x01FFFFFF, 0x03FFFFFF, 0x07FFFFFF, 0x0FFFFFFF,
0x1FFFFFFF, 0x3FFFFFFF, 0x7FFFFFFF, 0xFFFFFFFF
};
void BrotliInitBitReader(BrotliBitReader* const br) {
br->val_ = 0;
br->bit_pos_ = 0;
}
BROTLI_BOOL BrotliWarmupBitReader(BrotliBitReader* const br) {
size_t aligned_read_mask = (sizeof(br->val_) >> 1) - 1;
/* Fixing alignment after unaligned BrotliFillWindow would result accumulator
overflow. If unalignment is caused by BrotliSafeReadBits, then there is
enough space in accumulator to fix alignment. */
if (BROTLI_UNALIGNED_READ_FAST) {
aligned_read_mask = 0;
}
if (BrotliGetAvailableBits(br) == 0) {
br->val_ = 0;
if (!BrotliPullByte(br)) {
return BROTLI_FALSE;
}
}
while ((((size_t)br->next_in) & aligned_read_mask) != 0) {
if (!BrotliPullByte(br)) {
/* If we consumed all the input, we don't care about the alignment. */
return BROTLI_TRUE;
}
}
return BROTLI_TRUE;
}
BROTLI_BOOL BrotliSafeReadBits32Slow(BrotliBitReader* const br,
brotli_reg_t n_bits, brotli_reg_t* val) {
brotli_reg_t low_val;
brotli_reg_t high_val;
BrotliBitReaderState memento;
BROTLI_DCHECK(n_bits <= 32);
BROTLI_DCHECK(n_bits > 24);
BrotliBitReaderSaveState(br, &memento);
if (!BrotliSafeReadBits(br, 16, &low_val) ||
!BrotliSafeReadBits(br, n_bits - 16, &high_val)) {
BrotliBitReaderRestoreState(br, &memento);
return BROTLI_FALSE;
}
*val = low_val | (high_val << 16);
return BROTLI_TRUE;
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

419
c/dec/bit_reader.h Normal file
View File

@@ -0,0 +1,419 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Bit reading helpers */
#ifndef BROTLI_DEC_BIT_READER_H_
#define BROTLI_DEC_BIT_READER_H_
#include "../common/constants.h"
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#define BROTLI_SHORT_FILL_BIT_WINDOW_READ (sizeof(brotli_reg_t) >> 1)
/* 162 bits + 7 bytes */
#define BROTLI_FAST_INPUT_SLACK 28
BROTLI_INTERNAL extern const brotli_reg_t kBrotliBitMask[33];
static BROTLI_INLINE brotli_reg_t BitMask(brotli_reg_t n) {
if (BROTLI_IS_CONSTANT(n) || BROTLI_HAS_UBFX) {
/* Masking with this expression turns to a single
"Unsigned Bit Field Extract" UBFX instruction on ARM. */
return ~(~((brotli_reg_t)0) << n);
} else {
return kBrotliBitMask[n];
}
}
typedef struct {
brotli_reg_t val_; /* pre-fetched bits */
brotli_reg_t bit_pos_; /* current bit-reading position in val_ */
const uint8_t* next_in; /* the byte we're reading from */
const uint8_t* guard_in; /* position from which "fast-path" is prohibited */
const uint8_t* last_in; /* == next_in + avail_in */
} BrotliBitReader;
typedef struct {
brotli_reg_t val_;
brotli_reg_t bit_pos_;
const uint8_t* next_in;
size_t avail_in;
} BrotliBitReaderState;
/* Initializes the BrotliBitReader fields. */
BROTLI_INTERNAL void BrotliInitBitReader(BrotliBitReader* br);
/* Ensures that accumulator is not empty.
May consume up to sizeof(brotli_reg_t) - 1 bytes of input.
Returns BROTLI_FALSE if data is required but there is no input available.
For !BROTLI_UNALIGNED_READ_FAST this function also prepares bit reader for
aligned reading. */
BROTLI_INTERNAL BROTLI_BOOL BrotliWarmupBitReader(BrotliBitReader* br);
/* Fallback for BrotliSafeReadBits32. Extracted as noninlined method to unburden
the main code-path. Never called for RFC brotli streams, required only for
"large-window" mode and other extensions. */
BROTLI_INTERNAL BROTLI_NOINLINE BROTLI_BOOL BrotliSafeReadBits32Slow(
BrotliBitReader* br, brotli_reg_t n_bits, brotli_reg_t* val);
static BROTLI_INLINE size_t
BrotliBitReaderGetAvailIn(BrotliBitReader* const br) {
return (size_t)(br->last_in - br->next_in);
}
static BROTLI_INLINE void BrotliBitReaderSaveState(
BrotliBitReader* const from, BrotliBitReaderState* to) {
to->val_ = from->val_;
to->bit_pos_ = from->bit_pos_;
to->next_in = from->next_in;
to->avail_in = BrotliBitReaderGetAvailIn(from);
}
static BROTLI_INLINE void BrotliBitReaderSetInput(
BrotliBitReader* const br, const uint8_t* next_in, size_t avail_in) {
br->next_in = next_in;
br->last_in = (avail_in == 0) ? next_in : (next_in + avail_in);
if (avail_in + 1 > BROTLI_FAST_INPUT_SLACK) {
br->guard_in = next_in + (avail_in + 1 - BROTLI_FAST_INPUT_SLACK);
} else {
br->guard_in = next_in;
}
}
static BROTLI_INLINE void BrotliBitReaderRestoreState(
BrotliBitReader* const to, BrotliBitReaderState* from) {
to->val_ = from->val_;
to->bit_pos_ = from->bit_pos_;
to->next_in = from->next_in;
BrotliBitReaderSetInput(to, from->next_in, from->avail_in);
}
static BROTLI_INLINE brotli_reg_t BrotliGetAvailableBits(
const BrotliBitReader* br) {
return br->bit_pos_;
}
/* Returns amount of unread bytes the bit reader still has buffered from the
BrotliInput, including whole bytes in br->val_. Result is capped with
maximal ring-buffer size (larger number won't be utilized anyway). */
static BROTLI_INLINE size_t BrotliGetRemainingBytes(BrotliBitReader* br) {
static const size_t kCap = (size_t)1 << BROTLI_LARGE_MAX_WBITS;
size_t avail_in = BrotliBitReaderGetAvailIn(br);
if (avail_in > kCap) return kCap;
return avail_in + (BrotliGetAvailableBits(br) >> 3);
}
/* Checks if there is at least |num| bytes left in the input ring-buffer
(excluding the bits remaining in br->val_). */
static BROTLI_INLINE BROTLI_BOOL BrotliCheckInputAmount(
BrotliBitReader* const br) {
return TO_BROTLI_BOOL(br->next_in < br->guard_in);
}
/* Load more bits into accumulator. */
static BROTLI_INLINE brotli_reg_t BrotliBitReaderLoadBits(brotli_reg_t val,
brotli_reg_t new_bits,
brotli_reg_t count,
brotli_reg_t offset) {
BROTLI_DCHECK(
!((val >> offset) & ~new_bits & ~(~((brotli_reg_t)0) << count)));
(void)count;
return val | (new_bits << offset);
}
/* Guarantees that there are at least |n_bits| + 1 bits in accumulator.
Precondition: accumulator contains at least 1 bit.
|n_bits| should be in the range [1..24] for regular build. For portable
non-64-bit little-endian build only 16 bits are safe to request. */
static BROTLI_INLINE void BrotliFillBitWindow(
BrotliBitReader* const br, brotli_reg_t n_bits) {
#if (BROTLI_64_BITS)
if (BROTLI_UNALIGNED_READ_FAST && BROTLI_IS_CONSTANT(n_bits) &&
(n_bits <= 8)) {
brotli_reg_t bit_pos = br->bit_pos_;
if (bit_pos <= 8) {
br->val_ = BrotliBitReaderLoadBits(br->val_,
BROTLI_UNALIGNED_LOAD64LE(br->next_in), 56, bit_pos);
br->bit_pos_ = bit_pos + 56;
br->next_in += 7;
}
} else if (BROTLI_UNALIGNED_READ_FAST && BROTLI_IS_CONSTANT(n_bits) &&
(n_bits <= 16)) {
brotli_reg_t bit_pos = br->bit_pos_;
if (bit_pos <= 16) {
br->val_ = BrotliBitReaderLoadBits(br->val_,
BROTLI_UNALIGNED_LOAD64LE(br->next_in), 48, bit_pos);
br->bit_pos_ = bit_pos + 48;
br->next_in += 6;
}
} else {
brotli_reg_t bit_pos = br->bit_pos_;
if (bit_pos <= 32) {
br->val_ = BrotliBitReaderLoadBits(br->val_,
(uint64_t)BROTLI_UNALIGNED_LOAD32LE(br->next_in), 32, bit_pos);
br->bit_pos_ = bit_pos + 32;
br->next_in += BROTLI_SHORT_FILL_BIT_WINDOW_READ;
}
}
#else
if (BROTLI_UNALIGNED_READ_FAST && BROTLI_IS_CONSTANT(n_bits) &&
(n_bits <= 8)) {
brotli_reg_t bit_pos = br->bit_pos_;
if (bit_pos <= 8) {
br->val_ = BrotliBitReaderLoadBits(br->val_,
BROTLI_UNALIGNED_LOAD32LE(br->next_in), 24, bit_pos);
br->bit_pos_ = bit_pos + 24;
br->next_in += 3;
}
} else {
brotli_reg_t bit_pos = br->bit_pos_;
if (bit_pos <= 16) {
br->val_ = BrotliBitReaderLoadBits(br->val_,
(uint32_t)BROTLI_UNALIGNED_LOAD16LE(br->next_in), 16, bit_pos);
br->bit_pos_ = bit_pos + 16;
br->next_in += BROTLI_SHORT_FILL_BIT_WINDOW_READ;
}
}
#endif
}
/* Mostly like BrotliFillBitWindow, but guarantees only 16 bits and reads no
more than BROTLI_SHORT_FILL_BIT_WINDOW_READ bytes of input. */
static BROTLI_INLINE void BrotliFillBitWindow16(BrotliBitReader* const br) {
BrotliFillBitWindow(br, 17);
}
/* Tries to pull one byte of input to accumulator.
Returns BROTLI_FALSE if there is no input available. */
static BROTLI_INLINE BROTLI_BOOL BrotliPullByte(BrotliBitReader* const br) {
if (br->next_in == br->last_in) {
return BROTLI_FALSE;
}
br->val_ = BrotliBitReaderLoadBits(br->val_,
(brotli_reg_t)*br->next_in, 8, br->bit_pos_);
br->bit_pos_ += 8;
++br->next_in;
return BROTLI_TRUE;
}
/* Returns currently available bits.
The number of valid bits could be calculated by BrotliGetAvailableBits. */
static BROTLI_INLINE brotli_reg_t BrotliGetBitsUnmasked(
BrotliBitReader* const br) {
return br->val_;
}
/* Like BrotliGetBits, but does not mask the result.
The result contains at least 16 valid bits. */
static BROTLI_INLINE brotli_reg_t BrotliGet16BitsUnmasked(
BrotliBitReader* const br) {
BrotliFillBitWindow(br, 16);
return (brotli_reg_t)BrotliGetBitsUnmasked(br);
}
/* Returns the specified number of bits from |br| without advancing bit
position. */
static BROTLI_INLINE brotli_reg_t BrotliGetBits(
BrotliBitReader* const br, brotli_reg_t n_bits) {
BrotliFillBitWindow(br, n_bits);
return BrotliGetBitsUnmasked(br) & BitMask(n_bits);
}
/* Tries to peek the specified amount of bits. Returns BROTLI_FALSE, if there
is not enough input. */
static BROTLI_INLINE BROTLI_BOOL BrotliSafeGetBits(
BrotliBitReader* const br, brotli_reg_t n_bits, brotli_reg_t* val) {
while (BrotliGetAvailableBits(br) < n_bits) {
if (!BrotliPullByte(br)) {
return BROTLI_FALSE;
}
}
*val = BrotliGetBitsUnmasked(br) & BitMask(n_bits);
return BROTLI_TRUE;
}
/* Advances the bit pos by |n_bits|. */
static BROTLI_INLINE void BrotliDropBits(
BrotliBitReader* const br, brotli_reg_t n_bits) {
br->bit_pos_ -= n_bits;
br->val_ >>= n_bits;
}
/* Make sure that there are no spectre bits in accumulator.
This is important for the cases when some bytes are skipped
(i.e. never placed into accumulator). */
static BROTLI_INLINE void BrotliBitReaderNormalize(BrotliBitReader* br) {
/* Actually, it is enough to normalize when br->bit_pos_ == 0 */
if (br->bit_pos_ < (sizeof(brotli_reg_t) << 3u)) {
br->val_ &= (((brotli_reg_t)1) << br->bit_pos_) - 1;
}
}
static BROTLI_INLINE void BrotliBitReaderUnload(BrotliBitReader* br) {
brotli_reg_t unused_bytes = BrotliGetAvailableBits(br) >> 3;
brotli_reg_t unused_bits = unused_bytes << 3;
br->next_in =
(unused_bytes == 0) ? br->next_in : (br->next_in - unused_bytes);
br->bit_pos_ -= unused_bits;
BrotliBitReaderNormalize(br);
}
/* Reads the specified number of bits from |br| and advances the bit pos.
Precondition: accumulator MUST contain at least |n_bits|. */
static BROTLI_INLINE void BrotliTakeBits(BrotliBitReader* const br,
brotli_reg_t n_bits,
brotli_reg_t* val) {
*val = BrotliGetBitsUnmasked(br) & BitMask(n_bits);
BROTLI_LOG(("[BrotliTakeBits] %d %d %d val: %6x\n",
(int)BrotliBitReaderGetAvailIn(br), (int)br->bit_pos_,
(int)n_bits, (int)*val));
BrotliDropBits(br, n_bits);
}
/* Reads the specified number of bits from |br| and advances the bit pos.
Assumes that there is enough input to perform BrotliFillBitWindow.
Up to 24 bits are allowed to be requested from this method. */
static BROTLI_INLINE brotli_reg_t BrotliReadBits24(
BrotliBitReader* const br, brotli_reg_t n_bits) {
BROTLI_DCHECK(n_bits <= 24);
if (BROTLI_64_BITS || (n_bits <= 16)) {
brotli_reg_t val;
BrotliFillBitWindow(br, n_bits);
BrotliTakeBits(br, n_bits, &val);
return val;
} else {
brotli_reg_t low_val;
brotli_reg_t high_val;
BrotliFillBitWindow(br, 16);
BrotliTakeBits(br, 16, &low_val);
BrotliFillBitWindow(br, 8);
BrotliTakeBits(br, n_bits - 16, &high_val);
return low_val | (high_val << 16);
}
}
/* Same as BrotliReadBits24, but allows reading up to 32 bits. */
static BROTLI_INLINE brotli_reg_t BrotliReadBits32(
BrotliBitReader* const br, brotli_reg_t n_bits) {
BROTLI_DCHECK(n_bits <= 32);
if (BROTLI_64_BITS || (n_bits <= 16)) {
brotli_reg_t val;
BrotliFillBitWindow(br, n_bits);
BrotliTakeBits(br, n_bits, &val);
return val;
} else {
brotli_reg_t low_val;
brotli_reg_t high_val;
BrotliFillBitWindow(br, 16);
BrotliTakeBits(br, 16, &low_val);
BrotliFillBitWindow(br, 16);
BrotliTakeBits(br, n_bits - 16, &high_val);
return low_val | (high_val << 16);
}
}
/* Tries to read the specified amount of bits. Returns BROTLI_FALSE, if there
is not enough input. |n_bits| MUST be positive.
Up to 24 bits are allowed to be requested from this method. */
static BROTLI_INLINE BROTLI_BOOL BrotliSafeReadBits(
BrotliBitReader* const br, brotli_reg_t n_bits, brotli_reg_t* val) {
BROTLI_DCHECK(n_bits <= 24);
while (BrotliGetAvailableBits(br) < n_bits) {
if (!BrotliPullByte(br)) {
return BROTLI_FALSE;
}
}
BrotliTakeBits(br, n_bits, val);
return BROTLI_TRUE;
}
/* Same as BrotliSafeReadBits, but allows reading up to 32 bits. */
static BROTLI_INLINE BROTLI_BOOL BrotliSafeReadBits32(
BrotliBitReader* const br, brotli_reg_t n_bits, brotli_reg_t* val) {
BROTLI_DCHECK(n_bits <= 32);
if (BROTLI_64_BITS || (n_bits <= 24)) {
while (BrotliGetAvailableBits(br) < n_bits) {
if (!BrotliPullByte(br)) {
return BROTLI_FALSE;
}
}
BrotliTakeBits(br, n_bits, val);
return BROTLI_TRUE;
} else {
return BrotliSafeReadBits32Slow(br, n_bits, val);
}
}
/* Advances the bit reader position to the next byte boundary and verifies
that any skipped bits are set to zero. */
static BROTLI_INLINE BROTLI_BOOL BrotliJumpToByteBoundary(BrotliBitReader* br) {
brotli_reg_t pad_bits_count = BrotliGetAvailableBits(br) & 0x7;
brotli_reg_t pad_bits = 0;
if (pad_bits_count != 0) {
BrotliTakeBits(br, pad_bits_count, &pad_bits);
}
BrotliBitReaderNormalize(br);
return TO_BROTLI_BOOL(pad_bits == 0);
}
static BROTLI_INLINE void BrotliDropBytes(BrotliBitReader* br, size_t num) {
/* Check detour is legal: accumulator must to be empty. */
BROTLI_DCHECK(br->bit_pos_ == 0);
BROTLI_DCHECK(br->val_ == 0);
br->next_in += num;
}
/* Copies remaining input bytes stored in the bit reader to the output. Value
|num| may not be larger than BrotliGetRemainingBytes. The bit reader must be
warmed up again after this. */
static BROTLI_INLINE void BrotliCopyBytes(uint8_t* dest,
BrotliBitReader* br, size_t num) {
while (BrotliGetAvailableBits(br) >= 8 && num > 0) {
*dest = (uint8_t)BrotliGetBitsUnmasked(br);
BrotliDropBits(br, 8);
++dest;
--num;
}
BrotliBitReaderNormalize(br);
if (num > 0) {
memcpy(dest, br->next_in, num);
BrotliDropBytes(br, num);
}
}
BROTLI_UNUSED_FUNCTION void BrotliBitReaderSuppressUnusedFunctions(void) {
BROTLI_UNUSED(&BrotliBitReaderSuppressUnusedFunctions);
BROTLI_UNUSED(&BrotliBitReaderGetAvailIn);
BROTLI_UNUSED(&BrotliBitReaderLoadBits);
BROTLI_UNUSED(&BrotliBitReaderRestoreState);
BROTLI_UNUSED(&BrotliBitReaderSaveState);
BROTLI_UNUSED(&BrotliBitReaderSetInput);
BROTLI_UNUSED(&BrotliBitReaderUnload);
BROTLI_UNUSED(&BrotliCheckInputAmount);
BROTLI_UNUSED(&BrotliCopyBytes);
BROTLI_UNUSED(&BrotliFillBitWindow16);
BROTLI_UNUSED(&BrotliGet16BitsUnmasked);
BROTLI_UNUSED(&BrotliGetBits);
BROTLI_UNUSED(&BrotliGetRemainingBytes);
BROTLI_UNUSED(&BrotliJumpToByteBoundary);
BROTLI_UNUSED(&BrotliReadBits24);
BROTLI_UNUSED(&BrotliReadBits32);
BROTLI_UNUSED(&BrotliSafeGetBits);
BROTLI_UNUSED(&BrotliSafeReadBits);
BROTLI_UNUSED(&BrotliSafeReadBits32);
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_DEC_BIT_READER_H_ */

2964
c/dec/decode.c Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -6,12 +6,10 @@
/* Utilities for building Huffman decoding tables. */
#include "./huffman.h"
#include "huffman.h"
#include <string.h> /* memcpy, memset */
#include "./port.h"
#include "./types.h"
#include "../common/constants.h"
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
@@ -19,11 +17,13 @@ extern "C" {
#define BROTLI_REVERSE_BITS_MAX 8
#ifdef BROTLI_RBIT
#define BROTLI_REVERSE_BITS_BASE (32 - BROTLI_REVERSE_BITS_MAX)
#if defined(BROTLI_RBIT)
#define BROTLI_REVERSE_BITS_BASE \
((sizeof(brotli_reg_t) << 3) - BROTLI_REVERSE_BITS_MAX)
#else
#define BROTLI_REVERSE_BITS_BASE 0
static uint8_t kReverseBits[1 << BROTLI_REVERSE_BITS_MAX] = {
static BROTLI_MODEL("small")
uint8_t kReverseBits[1 << BROTLI_REVERSE_BITS_MAX] = {
0x00, 0x80, 0x40, 0xC0, 0x20, 0xA0, 0x60, 0xE0,
0x10, 0x90, 0x50, 0xD0, 0x30, 0xB0, 0x70, 0xF0,
0x08, 0x88, 0x48, 0xC8, 0x28, 0xA8, 0x68, 0xE8,
@@ -60,13 +60,13 @@ static uint8_t kReverseBits[1 << BROTLI_REVERSE_BITS_MAX] = {
#endif /* BROTLI_RBIT */
#define BROTLI_REVERSE_BITS_LOWEST \
(1U << (BROTLI_REVERSE_BITS_MAX - 1 + BROTLI_REVERSE_BITS_BASE))
((brotli_reg_t)1 << (BROTLI_REVERSE_BITS_MAX - 1 + BROTLI_REVERSE_BITS_BASE))
/* Returns reverse(num >> BROTLI_REVERSE_BITS_BASE, BROTLI_REVERSE_BITS_MAX),
where reverse(value, len) is the bit-wise reversal of the len least
significant bits of value. */
static BROTLI_INLINE uint32_t BrotliReverseBits(uint32_t num) {
#ifdef BROTLI_RBIT
static BROTLI_INLINE brotli_reg_t BrotliReverseBits(brotli_reg_t num) {
#if defined(BROTLI_RBIT)
return BROTLI_RBIT(num);
#else
return kReverseBits[num];
@@ -84,9 +84,9 @@ static BROTLI_INLINE void ReplicateValue(HuffmanCode* table,
} while (end > 0);
}
/* Returns the table width of the next 2nd level table. count is the histogram
of bit lengths for the remaining symbols, len is the code length of the next
processed symbol */
/* Returns the table width of the next 2nd level table. |count| is the histogram
of bit lengths for the remaining symbols, |len| is the code length of the
next processed symbol. */
static BROTLI_INLINE int NextTableBitSize(const uint16_t* const count,
int len, int root_bits) {
int left = 1 << (len - root_bits);
@@ -102,35 +102,37 @@ static BROTLI_INLINE int NextTableBitSize(const uint16_t* const count,
void BrotliBuildCodeLengthsHuffmanTable(HuffmanCode* table,
const uint8_t* const code_lengths,
uint16_t* count) {
HuffmanCode code; /* current table entry */
int symbol; /* symbol index in original or sorted table */
uint32_t key; /* prefix code */
uint32_t key_step; /* prefix code addend */
int step; /* step size to replicate values in current table */
int table_size; /* size of current table */
int sorted[18]; /* symbols sorted by code length */
HuffmanCode code; /* current table entry */
int symbol; /* symbol index in original or sorted table */
brotli_reg_t key; /* prefix code */
brotli_reg_t key_step; /* prefix code addend */
int step; /* step size to replicate values in current table */
int table_size; /* size of current table */
int sorted[BROTLI_CODE_LENGTH_CODES]; /* symbols sorted by code length */
/* offsets in sorted table for each length */
int offset[BROTLI_HUFFMAN_MAX_CODE_LENGTH_CODE_LENGTH + 1];
int bits;
int bits_count;
BROTLI_DCHECK(BROTLI_HUFFMAN_MAX_CODE_LENGTH_CODE_LENGTH <=
BROTLI_REVERSE_BITS_MAX);
BROTLI_DCHECK(BROTLI_HUFFMAN_MAX_CODE_LENGTH_CODE_LENGTH == 5);
/* generate offsets into sorted symbol table by code length */
/* Generate offsets into sorted symbol table by code length. */
symbol = -1;
bits = 1;
BROTLI_REPEAT(BROTLI_HUFFMAN_MAX_CODE_LENGTH_CODE_LENGTH, {
/* BROTLI_HUFFMAN_MAX_CODE_LENGTH_CODE_LENGTH == 5 */
BROTLI_REPEAT_5({
symbol += count[bits];
offset[bits] = symbol;
bits++;
});
/* Symbols with code length 0 are placed after all other symbols. */
offset[0] = 17;
offset[0] = BROTLI_CODE_LENGTH_CODES - 1;
/* sort symbols by length, by symbol order within each length */
symbol = 18;
/* Sort symbols by length, by symbol order within each length. */
symbol = BROTLI_CODE_LENGTH_CODES;
do {
BROTLI_REPEAT(6, {
BROTLI_REPEAT_6({
symbol--;
sorted[offset[code_lengths[symbol]]--] = symbol;
});
@@ -140,24 +142,22 @@ void BrotliBuildCodeLengthsHuffmanTable(HuffmanCode* table,
/* Special case: all symbols but one have 0 code length. */
if (offset[0] == 0) {
code.bits = 0;
code.value = (uint16_t)sorted[0];
for (key = 0; key < (uint32_t)table_size; ++key) {
code = ConstructHuffmanCode(0, (uint16_t)sorted[0]);
for (key = 0; key < (brotli_reg_t)table_size; ++key) {
table[key] = code;
}
return;
}
/* fill in table */
/* Fill in table. */
key = 0;
key_step = BROTLI_REVERSE_BITS_LOWEST;
symbol = 0;
bits = 1;
step = 2;
do {
code.bits = (uint8_t)bits;
for (bits_count = count[bits]; bits_count != 0; --bits_count) {
code.value = (uint16_t)sorted[symbol++];
code = ConstructHuffmanCode((uint8_t)bits, (uint16_t)sorted[symbol++]);
ReplicateValue(&table[BrotliReverseBits(key)], step, table_size, code);
key += key_step;
}
@@ -174,10 +174,10 @@ uint32_t BrotliBuildHuffmanTable(HuffmanCode* root_table,
HuffmanCode* table; /* next available space in table */
int len; /* current code length */
int symbol; /* symbol index in original or sorted table */
uint32_t key; /* prefix code */
uint32_t key_step; /* prefix code addend */
uint32_t sub_key; /* 2nd level table prefix code */
uint32_t sub_key_step; /* 2nd level table prefix code addend */
brotli_reg_t key; /* prefix code */
brotli_reg_t key_step; /* prefix code addend */
brotli_reg_t sub_key; /* 2nd level table prefix code */
brotli_reg_t sub_key_step; /* 2nd level table prefix code addend */
int step; /* step size to replicate values in current table */
int table_bits; /* key length of current table */
int table_size; /* size of current table */
@@ -198,9 +198,8 @@ uint32_t BrotliBuildHuffmanTable(HuffmanCode* root_table,
table_size = 1 << table_bits;
total_size = table_size;
/* fill in root table */
/* let's reduce the table size to a smaller size if possible, and */
/* create the repetitions by memcpy if possible in the coming loop */
/* Fill in the root table. Reduce the table size to if possible,
and create the repetitions by memcpy. */
if (table_bits > max_length) {
table_bits = max_length;
table_size = 1 << table_bits;
@@ -210,11 +209,10 @@ uint32_t BrotliBuildHuffmanTable(HuffmanCode* root_table,
bits = 1;
step = 2;
do {
code.bits = (uint8_t)bits;
symbol = bits - (BROTLI_HUFFMAN_MAX_CODE_LENGTH + 1);
for (bits_count = count[bits]; bits_count != 0; --bits_count) {
symbol = symbol_lists[symbol];
code.value = (uint16_t)symbol;
code = ConstructHuffmanCode((uint8_t)bits, (uint16_t)symbol);
ReplicateValue(&table[BrotliReverseBits(key)], step, table_size, code);
key += key_step;
}
@@ -222,15 +220,14 @@ uint32_t BrotliBuildHuffmanTable(HuffmanCode* root_table,
key_step >>= 1;
} while (++bits <= table_bits);
/* if root_bits != table_bits we only created one fraction of the */
/* table, and we need to replicate it now. */
/* If root_bits != table_bits then replicate to fill the remaining slots. */
while (total_size != table_size) {
memcpy(&table[table_size], &table[0],
(size_t)table_size * sizeof(table[0]));
table_size <<= 1;
}
/* fill in 2nd level tables and add pointers to root table */
/* Fill in 2nd level tables and add pointers to root table. */
key_step = BROTLI_REVERSE_BITS_LOWEST >> (root_bits - 1);
sub_key = (BROTLI_REVERSE_BITS_LOWEST << 1);
sub_key_step = BROTLI_REVERSE_BITS_LOWEST;
@@ -244,14 +241,13 @@ uint32_t BrotliBuildHuffmanTable(HuffmanCode* root_table,
total_size += table_size;
sub_key = BrotliReverseBits(key);
key += key_step;
root_table[sub_key].bits = (uint8_t)(table_bits + root_bits);
root_table[sub_key].value =
(uint16_t)(((size_t)(table - root_table)) - sub_key);
root_table[sub_key] = ConstructHuffmanCode(
(uint8_t)(table_bits + root_bits),
(uint16_t)(((size_t)(table - root_table)) - sub_key));
sub_key = 0;
}
code.bits = (uint8_t)(len - root_bits);
symbol = symbol_lists[symbol];
code.value = (uint16_t)symbol;
code = ConstructHuffmanCode((uint8_t)(len - root_bits), (uint16_t)symbol);
ReplicateValue(
&table[BrotliReverseBits(sub_key)], step, table_size, code);
sub_key += sub_key_step;
@@ -270,35 +266,28 @@ uint32_t BrotliBuildSimpleHuffmanTable(HuffmanCode* table,
const uint32_t goal_size = 1U << root_bits;
switch (num_symbols) {
case 0:
table[0].bits = 0;
table[0].value = val[0];
table[0] = ConstructHuffmanCode(0, val[0]);
break;
case 1:
table[0].bits = 1;
table[1].bits = 1;
if (val[1] > val[0]) {
table[0].value = val[0];
table[1].value = val[1];
table[0] = ConstructHuffmanCode(1, val[0]);
table[1] = ConstructHuffmanCode(1, val[1]);
} else {
table[0].value = val[1];
table[1].value = val[0];
table[0] = ConstructHuffmanCode(1, val[1]);
table[1] = ConstructHuffmanCode(1, val[0]);
}
table_size = 2;
break;
case 2:
table[0].bits = 1;
table[0].value = val[0];
table[2].bits = 1;
table[2].value = val[0];
table[0] = ConstructHuffmanCode(1, val[0]);
table[2] = ConstructHuffmanCode(1, val[0]);
if (val[2] > val[1]) {
table[1].value = val[1];
table[3].value = val[2];
table[1] = ConstructHuffmanCode(2, val[1]);
table[3] = ConstructHuffmanCode(2, val[2]);
} else {
table[1].value = val[2];
table[3].value = val[1];
table[1] = ConstructHuffmanCode(2, val[2]);
table[3] = ConstructHuffmanCode(2, val[1]);
}
table[1].bits = 2;
table[3].bits = 2;
table_size = 4;
break;
case 3: {
@@ -312,33 +301,27 @@ uint32_t BrotliBuildSimpleHuffmanTable(HuffmanCode* table,
}
}
}
for (i = 0; i < 4; ++i) {
table[i].bits = 2;
}
table[0].value = val[0];
table[2].value = val[1];
table[1].value = val[2];
table[3].value = val[3];
table[0] = ConstructHuffmanCode(2, val[0]);
table[2] = ConstructHuffmanCode(2, val[1]);
table[1] = ConstructHuffmanCode(2, val[2]);
table[3] = ConstructHuffmanCode(2, val[3]);
table_size = 4;
break;
}
case 4: {
int i;
if (val[3] < val[2]) {
uint16_t t = val[3];
val[3] = val[2];
val[2] = t;
}
for (i = 0; i < 7; ++i) {
table[i].value = val[0];
table[i].bits = (uint8_t)(1 + (i & 1));
}
table[1].value = val[1];
table[3].value = val[2];
table[5].value = val[1];
table[7].value = val[3];
table[3].bits = 3;
table[7].bits = 3;
table[0] = ConstructHuffmanCode(1, val[0]);
table[1] = ConstructHuffmanCode(2, val[1]);
table[2] = ConstructHuffmanCode(1, val[0]);
table[3] = ConstructHuffmanCode(3, val[2]);
table[4] = ConstructHuffmanCode(1, val[0]);
table[5] = ConstructHuffmanCode(2, val[1]);
table[6] = ConstructHuffmanCode(1, val[0]);
table[7] = ConstructHuffmanCode(3, val[3]);
table_size = 8;
break;
}

120
c/dec/huffman.h Normal file
View File

@@ -0,0 +1,120 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Utilities for building Huffman decoding tables. */
#ifndef BROTLI_DEC_HUFFMAN_H_
#define BROTLI_DEC_HUFFMAN_H_
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#define BROTLI_HUFFMAN_MAX_CODE_LENGTH 15
/* BROTLI_NUM_BLOCK_LEN_SYMBOLS == 26 */
#define BROTLI_HUFFMAN_MAX_SIZE_26 396
/* BROTLI_MAX_BLOCK_TYPE_SYMBOLS == 258 */
#define BROTLI_HUFFMAN_MAX_SIZE_258 632
/* BROTLI_MAX_CONTEXT_MAP_SYMBOLS == 272 */
#define BROTLI_HUFFMAN_MAX_SIZE_272 646
#define BROTLI_HUFFMAN_MAX_CODE_LENGTH_CODE_LENGTH 5
#if ((defined(BROTLI_TARGET_ARMV7) || defined(BROTLI_TARGET_ARMV8_32)) && \
BROTLI_GNUC_HAS_ATTRIBUTE(aligned, 2, 7, 0))
#define BROTLI_HUFFMAN_CODE_FAST_LOAD
#endif
#if !defined(BROTLI_HUFFMAN_CODE_FAST_LOAD)
/* Do not create this struct directly - use the ConstructHuffmanCode
* constructor below! */
typedef struct {
uint8_t bits; /* number of bits used for this symbol */
uint16_t value; /* symbol value or table offset */
} HuffmanCode;
static BROTLI_INLINE HuffmanCode ConstructHuffmanCode(const uint8_t bits,
const uint16_t value) {
HuffmanCode h;
h.bits = bits;
h.value = value;
return h;
}
/* Please use the following macros to optimize HuffmanCode accesses in hot
* paths.
*
* For example, assuming |table| contains a HuffmanCode pointer:
*
* BROTLI_HC_MARK_TABLE_FOR_FAST_LOAD(table);
* BROTLI_HC_ADJUST_TABLE_INDEX(table, index_into_table);
* *bits = BROTLI_HC_GET_BITS(table);
* *value = BROTLI_HC_GET_VALUE(table);
* BROTLI_HC_ADJUST_TABLE_INDEX(table, offset);
* *bits2 = BROTLI_HC_GET_BITS(table);
* *value2 = BROTLI_HC_GET_VALUE(table);
*
*/
#define BROTLI_HC_MARK_TABLE_FOR_FAST_LOAD(H)
#define BROTLI_HC_ADJUST_TABLE_INDEX(H, V) H += (V)
/* These must be given a HuffmanCode pointer! */
#define BROTLI_HC_FAST_LOAD_BITS(H) (H->bits)
#define BROTLI_HC_FAST_LOAD_VALUE(H) (H->value)
#else /* BROTLI_HUFFMAN_CODE_FAST_LOAD */
typedef BROTLI_ALIGNED(4) uint32_t HuffmanCode;
static BROTLI_INLINE HuffmanCode ConstructHuffmanCode(const uint8_t bits,
const uint16_t value) {
return (HuffmanCode) ((value & 0xFFFF) << 16) | (bits & 0xFF);
}
#define BROTLI_HC_MARK_TABLE_FOR_FAST_LOAD(H) uint32_t __fastload_##H = (*H)
#define BROTLI_HC_ADJUST_TABLE_INDEX(H, V) H += (V); __fastload_##H = (*H)
/* These must be given a HuffmanCode pointer! */
#define BROTLI_HC_FAST_LOAD_BITS(H) ((__fastload_##H) & 0xFF)
#define BROTLI_HC_FAST_LOAD_VALUE(H) ((__fastload_##H) >> 16)
#endif /* BROTLI_HUFFMAN_CODE_FAST_LOAD */
/* Builds Huffman lookup table assuming code lengths are in symbol order. */
BROTLI_INTERNAL void BrotliBuildCodeLengthsHuffmanTable(HuffmanCode* root_table,
const uint8_t* const code_lengths, uint16_t* count);
/* Builds Huffman lookup table assuming code lengths are in symbol order.
Returns size of resulting table. */
BROTLI_INTERNAL uint32_t BrotliBuildHuffmanTable(HuffmanCode* root_table,
int root_bits, const uint16_t* const symbol_lists, uint16_t* count);
/* Builds a simple Huffman table. The |num_symbols| parameter is to be
interpreted as follows: 0 means 1 symbol, 1 means 2 symbols,
2 means 3 symbols, 3 means 4 symbols with lengths [2, 2, 2, 2],
4 means 4 symbols with lengths [1, 2, 3, 3]. */
BROTLI_INTERNAL uint32_t BrotliBuildSimpleHuffmanTable(HuffmanCode* table,
int root_bits, uint16_t* symbols, uint32_t num_symbols);
/* Contains a collection of Huffman trees with the same alphabet size. */
/* alphabet_size_limit is needed due to simple codes, since
log2(alphabet_size_max) could be greater than log2(alphabet_size_limit). */
typedef struct {
HuffmanCode** htrees;
HuffmanCode* codes;
uint16_t alphabet_size_max;
uint16_t alphabet_size_limit;
uint16_t num_htrees;
} HuffmanTreeGroup;
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_DEC_HUFFMAN_H_ */

67
c/dec/prefix.c Normal file
View File

@@ -0,0 +1,67 @@
/* Copyright 2025 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "prefix.h"
#include "../common/platform.h" /* IWYU pragma: keep */
#include "../common/static_init.h"
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
#include "../common/constants.h"
#endif
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#if (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_NONE)
/* Embed kCmdLut. */
#include "prefix_inc.h"
#else
BROTLI_COLD BROTLI_BOOL BrotliDecoderInitCmdLut(CmdLutElement* items) {
static const uint8_t kInsertLengthExtraBits[24] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x01, 0x02, 0x02, 0x03, 0x03,
0x04, 0x04, 0x05, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0C, 0x0E, 0x18};
static const uint8_t kCopyLengthExtraBits[24] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x01, 0x02, 0x02,
0x03, 0x03, 0x04, 0x04, 0x05, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x18};
static const uint8_t kCellPos[11] = {0, 1, 0, 1, 8, 9, 2, 16, 10, 17, 18};
uint16_t insert_length_offsets[24];
uint16_t copy_length_offsets[24];
insert_length_offsets[0] = 0;
copy_length_offsets[0] = 2;
for (size_t i = 0; i < 23; ++i) {
insert_length_offsets[i + 1] =
insert_length_offsets[i] + (uint16_t)(1u << kInsertLengthExtraBits[i]);
copy_length_offsets[i + 1] =
copy_length_offsets[i] + (uint16_t)(1u << kCopyLengthExtraBits[i]);
}
for (size_t symbol = 0; symbol < BROTLI_NUM_COMMAND_SYMBOLS; ++symbol) {
CmdLutElement* item = items + symbol;
const size_t cell_idx = symbol >> 6;
const size_t cell_pos = kCellPos[cell_idx];
const size_t copy_code = ((cell_pos << 3) & 0x18) + (symbol & 0x7);
const uint16_t copy_len_offset = copy_length_offsets[copy_code];
const size_t insert_code = (cell_pos & 0x18) + ((symbol >> 3) & 0x7);
item->copy_len_extra_bits = kCopyLengthExtraBits[copy_code];
item->context = (copy_len_offset > 4) ? 3 : ((uint8_t)copy_len_offset - 2);
item->copy_len_offset = copy_len_offset;
item->distance_code = (cell_idx >= 2) ? -1 : 0;
item->insert_len_extra_bits = kInsertLengthExtraBits[insert_code];
item->insert_len_offset = insert_length_offsets[insert_code];
}
return BROTLI_TRUE;
}
BROTLI_MODEL("small")
CmdLutElement kCmdLut[BROTLI_NUM_COMMAND_SYMBOLS];
#endif /* BROTLI_STATIC_INIT */
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

43
c/dec/prefix.h Normal file
View File

@@ -0,0 +1,43 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Lookup tables to map prefix codes to value ranges. This is used during
decoding of the block lengths, literal insertion lengths and copy lengths. */
#ifndef BROTLI_DEC_PREFIX_H_
#define BROTLI_DEC_PREFIX_H_
#include "../common/constants.h"
#include "../common/platform.h" /* IWYU pragma: keep */
#include "../common/static_init.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
typedef struct CmdLutElement {
uint8_t insert_len_extra_bits;
uint8_t copy_len_extra_bits;
int8_t distance_code;
uint8_t context;
uint16_t insert_len_offset;
uint16_t copy_len_offset;
} CmdLutElement;
#if (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_NONE)
BROTLI_INTERNAL extern const BROTLI_MODEL("small")
CmdLutElement kCmdLut[BROTLI_NUM_COMMAND_SYMBOLS];
#else
BROTLI_INTERNAL BROTLI_BOOL BrotliDecoderInitCmdLut(CmdLutElement* items);
BROTLI_INTERNAL extern BROTLI_MODEL("small")
CmdLutElement kCmdLut[BROTLI_NUM_COMMAND_SYMBOLS];
#endif
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_DEC_PREFIX_H_ */

View File

@@ -1,45 +1,5 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Lookup tables to map prefix codes to value ranges. This is used during
decoding of the block lengths, literal insertion lengths and copy lengths.
*/
#ifndef BROTLI_DEC_PREFIX_H_
#define BROTLI_DEC_PREFIX_H_
#include "./types.h"
/* Represents the range of values belonging to a prefix code: */
/* [offset, offset + 2^nbits) */
struct PrefixCodeRange {
uint16_t offset;
uint8_t nbits;
};
static const struct PrefixCodeRange kBlockLengthPrefixCode[] = {
{ 1, 2}, { 5, 2}, { 9, 2}, { 13, 2},
{ 17, 3}, { 25, 3}, { 33, 3}, { 41, 3},
{ 49, 4}, { 65, 4}, { 81, 4}, { 97, 4},
{ 113, 5}, { 145, 5}, { 177, 5}, { 209, 5},
{ 241, 6}, { 305, 6}, { 369, 7}, { 497, 8},
{ 753, 9}, { 1265, 10}, {2289, 11}, {4337, 12},
{8433, 13}, {16625, 24}
};
typedef struct CmdLutElement {
uint8_t insert_len_extra_bits;
uint8_t copy_len_extra_bits;
int8_t distance_code;
uint8_t context;
uint16_t insert_len_offset;
uint16_t copy_len_offset;
} CmdLutElement;
static const CmdLutElement kCmdLut[704] = {
const BROTLI_MODEL("small")
CmdLutElement kCmdLut[BROTLI_NUM_COMMAND_SYMBOLS] = {
{ 0x00, 0x00, 0, 0x00, 0x0000, 0x0002 },
{ 0x00, 0x00, 0, 0x01, 0x0000, 0x0003 },
{ 0x00, 0x00, 0, 0x02, 0x0000, 0x0004 },
@@ -745,5 +705,3 @@ static const CmdLutElement kCmdLut[704] = {
{ 0x18, 0x0a, -1, 0x03, 0x5842, 0x0446 },
{ 0x18, 0x18, -1, 0x03, 0x5842, 0x0846 },
};
#endif /* BROTLI_DEC_PREFIX_H_ */

186
c/dec/state.c Normal file
View File

@@ -0,0 +1,186 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "state.h"
#include "../common/dictionary.h"
#include "../common/platform.h"
#include "huffman.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#ifdef BROTLI_REPORTING
/* When BROTLI_REPORTING is defined extra reporting module have to be linked. */
void BrotliDecoderOnStart(const BrotliDecoderState* s);
void BrotliDecoderOnFinish(const BrotliDecoderState* s);
#define BROTLI_DECODER_ON_START(s) BrotliDecoderOnStart(s);
#define BROTLI_DECODER_ON_FINISH(s) BrotliDecoderOnFinish(s);
#else
#if !defined(BROTLI_DECODER_ON_START)
#define BROTLI_DECODER_ON_START(s) (void)(s);
#endif
#if !defined(BROTLI_DECODER_ON_FINISH)
#define BROTLI_DECODER_ON_FINISH(s) (void)(s);
#endif
#endif
BROTLI_BOOL BrotliDecoderStateInit(BrotliDecoderState* s,
brotli_alloc_func alloc_func, brotli_free_func free_func, void* opaque) {
BROTLI_DECODER_ON_START(s);
if (!alloc_func) {
s->alloc_func = BrotliDefaultAllocFunc;
s->free_func = BrotliDefaultFreeFunc;
s->memory_manager_opaque = 0;
} else {
s->alloc_func = alloc_func;
s->free_func = free_func;
s->memory_manager_opaque = opaque;
}
s->error_code = 0; /* BROTLI_DECODER_NO_ERROR */
BrotliInitBitReader(&s->br);
s->state = BROTLI_STATE_UNINITED;
s->large_window = 0;
s->substate_metablock_header = BROTLI_STATE_METABLOCK_HEADER_NONE;
s->substate_uncompressed = BROTLI_STATE_UNCOMPRESSED_NONE;
s->substate_decode_uint8 = BROTLI_STATE_DECODE_UINT8_NONE;
s->substate_read_block_length = BROTLI_STATE_READ_BLOCK_LENGTH_NONE;
s->buffer_length = 0;
s->loop_counter = 0;
s->pos = 0;
s->rb_roundtrips = 0;
s->partial_pos_out = 0;
s->used_input = 0;
s->block_type_trees = NULL;
s->block_len_trees = NULL;
s->ringbuffer = NULL;
s->ringbuffer_size = 0;
s->new_ringbuffer_size = 0;
s->ringbuffer_mask = 0;
s->context_map = NULL;
s->context_modes = NULL;
s->dist_context_map = NULL;
s->context_map_slice = NULL;
s->dist_context_map_slice = NULL;
s->literal_hgroup.codes = NULL;
s->literal_hgroup.htrees = NULL;
s->insert_copy_hgroup.codes = NULL;
s->insert_copy_hgroup.htrees = NULL;
s->distance_hgroup.codes = NULL;
s->distance_hgroup.htrees = NULL;
s->is_last_metablock = 0;
s->is_uncompressed = 0;
s->is_metadata = 0;
s->should_wrap_ringbuffer = 0;
s->canny_ringbuffer_allocation = 1;
s->window_bits = 0;
s->max_distance = 0;
s->dist_rb[0] = 16;
s->dist_rb[1] = 15;
s->dist_rb[2] = 11;
s->dist_rb[3] = 4;
s->dist_rb_idx = 0;
s->block_type_trees = NULL;
s->block_len_trees = NULL;
s->mtf_upper_bound = 63;
s->compound_dictionary = NULL;
s->dictionary =
BrotliSharedDictionaryCreateInstance(alloc_func, free_func, opaque);
if (!s->dictionary) return BROTLI_FALSE;
s->metadata_start_func = NULL;
s->metadata_chunk_func = NULL;
s->metadata_callback_opaque = 0;
return BROTLI_TRUE;
}
void BrotliDecoderStateMetablockBegin(BrotliDecoderState* s) {
s->meta_block_remaining_len = 0;
s->block_length[0] = BROTLI_BLOCK_SIZE_CAP;
s->block_length[1] = BROTLI_BLOCK_SIZE_CAP;
s->block_length[2] = BROTLI_BLOCK_SIZE_CAP;
s->num_block_types[0] = 1;
s->num_block_types[1] = 1;
s->num_block_types[2] = 1;
s->block_type_rb[0] = 1;
s->block_type_rb[1] = 0;
s->block_type_rb[2] = 1;
s->block_type_rb[3] = 0;
s->block_type_rb[4] = 1;
s->block_type_rb[5] = 0;
s->context_map = NULL;
s->context_modes = NULL;
s->dist_context_map = NULL;
s->context_map_slice = NULL;
s->literal_htree = NULL;
s->dist_context_map_slice = NULL;
s->dist_htree_index = 0;
s->context_lookup = NULL;
s->literal_hgroup.codes = NULL;
s->literal_hgroup.htrees = NULL;
s->insert_copy_hgroup.codes = NULL;
s->insert_copy_hgroup.htrees = NULL;
s->distance_hgroup.codes = NULL;
s->distance_hgroup.htrees = NULL;
}
void BrotliDecoderStateCleanupAfterMetablock(BrotliDecoderState* s) {
BROTLI_DECODER_FREE(s, s->context_modes);
BROTLI_DECODER_FREE(s, s->context_map);
BROTLI_DECODER_FREE(s, s->dist_context_map);
BROTLI_DECODER_FREE(s, s->literal_hgroup.htrees);
BROTLI_DECODER_FREE(s, s->insert_copy_hgroup.htrees);
BROTLI_DECODER_FREE(s, s->distance_hgroup.htrees);
}
void BrotliDecoderStateCleanup(BrotliDecoderState* s) {
BrotliDecoderStateCleanupAfterMetablock(s);
BROTLI_DECODER_ON_FINISH(s);
BROTLI_DECODER_FREE(s, s->compound_dictionary);
BrotliSharedDictionaryDestroyInstance(s->dictionary);
s->dictionary = NULL;
BROTLI_DECODER_FREE(s, s->ringbuffer);
BROTLI_DECODER_FREE(s, s->block_type_trees);
}
BROTLI_BOOL BrotliDecoderHuffmanTreeGroupInit(BrotliDecoderState* s,
HuffmanTreeGroup* group, brotli_reg_t alphabet_size_max,
brotli_reg_t alphabet_size_limit, brotli_reg_t ntrees) {
/* 376 = 256 (1-st level table) + 4 + 7 + 15 + 31 + 63 (2-nd level mix-tables)
This number is discovered "unlimited" "enough" calculator; it is actually
a wee bigger than required in several cases (especially for alphabets with
less than 16 symbols). */
const size_t max_table_size = alphabet_size_limit + 376;
const size_t code_size = sizeof(HuffmanCode) * ntrees * max_table_size;
const size_t htree_size = sizeof(HuffmanCode*) * ntrees;
/* Pointer alignment is, hopefully, wider than sizeof(HuffmanCode). */
HuffmanCode** p = (HuffmanCode**)BROTLI_DECODER_ALLOC(s,
code_size + htree_size);
group->alphabet_size_max = (uint16_t)alphabet_size_max;
group->alphabet_size_limit = (uint16_t)alphabet_size_limit;
group->num_htrees = (uint16_t)ntrees;
group->htrees = p;
group->codes = p ? (HuffmanCode*)(&p[ntrees]) : NULL;
return !!p;
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

396
c/dec/state.h Normal file
View File

@@ -0,0 +1,396 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Brotli state for partial streaming decoding. */
#ifndef BROTLI_DEC_STATE_H_
#define BROTLI_DEC_STATE_H_
#include "../common/constants.h"
#include "../common/platform.h"
#include <brotli/shared_dictionary.h>
#include <brotli/decode.h>
#include "bit_reader.h"
#include "huffman.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* Graphviz diagram that describes state transitions:
digraph States {
graph [compound=true]
concentrate=true
node [shape="box"]
UNINITED -> {LARGE_WINDOW_BITS -> INITIALIZE}
subgraph cluster_metablock_workflow {
style="rounded"
label=< <B>METABLOCK CYCLE</B> >
METABLOCK_BEGIN -> METABLOCK_HEADER
METABLOCK_HEADER:sw -> METADATA
METABLOCK_HEADER:s -> UNCOMPRESSED
METABLOCK_HEADER:se -> METABLOCK_DONE:ne
METADATA:s -> METABLOCK_DONE:w
UNCOMPRESSED:s -> METABLOCK_DONE:n
METABLOCK_DONE:e -> METABLOCK_BEGIN:e [constraint="false"]
}
INITIALIZE -> METABLOCK_BEGIN
METABLOCK_DONE -> DONE
subgraph cluster_compressed_metablock {
style="rounded"
label=< <B>COMPRESSED METABLOCK</B> >
subgraph cluster_command {
style="rounded"
label=< <B>HOT LOOP</B> >
_METABLOCK_DONE_PORT_ [shape=point style=invis]
{
// Set different shape for nodes returning from "compressed metablock".
node [shape=invhouse]; CMD_INNER CMD_POST_DECODE_LITERALS;
CMD_POST_WRAP_COPY; CMD_INNER_WRITE; CMD_POST_WRITE_1;
}
CMD_BEGIN -> CMD_INNER -> CMD_POST_DECODE_LITERALS -> CMD_POST_WRAP_COPY
// IO ("write") nodes are not in the hot loop!
CMD_INNER_WRITE [style=dashed]
CMD_INNER -> CMD_INNER_WRITE
CMD_POST_WRITE_1 [style=dashed]
CMD_POST_DECODE_LITERALS -> CMD_POST_WRITE_1
CMD_POST_WRITE_2 [style=dashed]
CMD_POST_WRAP_COPY -> CMD_POST_WRITE_2
CMD_POST_WRITE_1 -> CMD_BEGIN:s [constraint="false"]
CMD_INNER_WRITE -> {CMD_INNER CMD_POST_DECODE_LITERALS}
[constraint="false"]
CMD_BEGIN:ne -> CMD_POST_DECODE_LITERALS [constraint="false"]
CMD_POST_WRAP_COPY -> CMD_BEGIN [constraint="false"]
CMD_POST_DECODE_LITERALS -> CMD_BEGIN:ne [constraint="false"]
CMD_POST_WRITE_2 -> CMD_POST_WRAP_COPY [constraint="false"]
{rank=same; CMD_BEGIN; CMD_INNER; CMD_POST_DECODE_LITERALS;
CMD_POST_WRAP_COPY}
{rank=same; CMD_INNER_WRITE; CMD_POST_WRITE_1; CMD_POST_WRITE_2}
{CMD_INNER CMD_POST_DECODE_LITERALS CMD_POST_WRAP_COPY} ->
_METABLOCK_DONE_PORT_ [style=invis]
{CMD_INNER_WRITE CMD_POST_WRITE_1} -> _METABLOCK_DONE_PORT_
[constraint="false" style=invis]
}
BEFORE_COMPRESSED_METABLOCK_HEADER:s -> HUFFMAN_CODE_0:n
HUFFMAN_CODE_0 -> HUFFMAN_CODE_1 -> HUFFMAN_CODE_2 -> HUFFMAN_CODE_3
HUFFMAN_CODE_0 -> METABLOCK_HEADER_2 -> CONTEXT_MODES -> CONTEXT_MAP_1
CONTEXT_MAP_1 -> CONTEXT_MAP_2 -> TREE_GROUP
TREE_GROUP -> BEFORE_COMPRESSED_METABLOCK_BODY:e
BEFORE_COMPRESSED_METABLOCK_BODY:s -> CMD_BEGIN:n
HUFFMAN_CODE_3:e -> HUFFMAN_CODE_0:ne [constraint="false"]
{rank=same; HUFFMAN_CODE_0; HUFFMAN_CODE_1; HUFFMAN_CODE_2; HUFFMAN_CODE_3}
{rank=same; METABLOCK_HEADER_2; CONTEXT_MODES; CONTEXT_MAP_1; CONTEXT_MAP_2;
TREE_GROUP}
}
METABLOCK_HEADER:e -> BEFORE_COMPRESSED_METABLOCK_HEADER:n
_METABLOCK_DONE_PORT_ -> METABLOCK_DONE:se
[constraint="false" ltail=cluster_command]
UNINITED [shape=Mdiamond];
DONE [shape=Msquare];
}
*/
typedef enum {
BROTLI_STATE_UNINITED,
BROTLI_STATE_LARGE_WINDOW_BITS,
BROTLI_STATE_INITIALIZE,
BROTLI_STATE_METABLOCK_BEGIN,
BROTLI_STATE_METABLOCK_HEADER,
BROTLI_STATE_METABLOCK_HEADER_2,
BROTLI_STATE_CONTEXT_MODES,
BROTLI_STATE_COMMAND_BEGIN,
BROTLI_STATE_COMMAND_INNER,
BROTLI_STATE_COMMAND_POST_DECODE_LITERALS,
BROTLI_STATE_COMMAND_POST_WRAP_COPY,
BROTLI_STATE_UNCOMPRESSED,
BROTLI_STATE_METADATA,
BROTLI_STATE_COMMAND_INNER_WRITE,
BROTLI_STATE_METABLOCK_DONE,
BROTLI_STATE_COMMAND_POST_WRITE_1,
BROTLI_STATE_COMMAND_POST_WRITE_2,
BROTLI_STATE_BEFORE_COMPRESSED_METABLOCK_HEADER,
BROTLI_STATE_HUFFMAN_CODE_0,
BROTLI_STATE_HUFFMAN_CODE_1,
BROTLI_STATE_HUFFMAN_CODE_2,
BROTLI_STATE_HUFFMAN_CODE_3,
BROTLI_STATE_CONTEXT_MAP_1,
BROTLI_STATE_CONTEXT_MAP_2,
BROTLI_STATE_TREE_GROUP,
BROTLI_STATE_BEFORE_COMPRESSED_METABLOCK_BODY,
BROTLI_STATE_DONE
} BrotliRunningState;
typedef enum {
BROTLI_STATE_METABLOCK_HEADER_NONE,
BROTLI_STATE_METABLOCK_HEADER_EMPTY,
BROTLI_STATE_METABLOCK_HEADER_NIBBLES,
BROTLI_STATE_METABLOCK_HEADER_SIZE,
BROTLI_STATE_METABLOCK_HEADER_UNCOMPRESSED,
BROTLI_STATE_METABLOCK_HEADER_RESERVED,
BROTLI_STATE_METABLOCK_HEADER_BYTES,
BROTLI_STATE_METABLOCK_HEADER_METADATA
} BrotliRunningMetablockHeaderState;
typedef enum {
BROTLI_STATE_UNCOMPRESSED_NONE,
BROTLI_STATE_UNCOMPRESSED_WRITE
} BrotliRunningUncompressedState;
typedef enum {
BROTLI_STATE_TREE_GROUP_NONE,
BROTLI_STATE_TREE_GROUP_LOOP
} BrotliRunningTreeGroupState;
typedef enum {
BROTLI_STATE_CONTEXT_MAP_NONE,
BROTLI_STATE_CONTEXT_MAP_READ_PREFIX,
BROTLI_STATE_CONTEXT_MAP_HUFFMAN,
BROTLI_STATE_CONTEXT_MAP_DECODE,
BROTLI_STATE_CONTEXT_MAP_TRANSFORM
} BrotliRunningContextMapState;
typedef enum {
BROTLI_STATE_HUFFMAN_NONE,
BROTLI_STATE_HUFFMAN_SIMPLE_SIZE,
BROTLI_STATE_HUFFMAN_SIMPLE_READ,
BROTLI_STATE_HUFFMAN_SIMPLE_BUILD,
BROTLI_STATE_HUFFMAN_COMPLEX,
BROTLI_STATE_HUFFMAN_LENGTH_SYMBOLS
} BrotliRunningHuffmanState;
typedef enum {
BROTLI_STATE_DECODE_UINT8_NONE,
BROTLI_STATE_DECODE_UINT8_SHORT,
BROTLI_STATE_DECODE_UINT8_LONG
} BrotliRunningDecodeUint8State;
typedef enum {
BROTLI_STATE_READ_BLOCK_LENGTH_NONE,
BROTLI_STATE_READ_BLOCK_LENGTH_SUFFIX
} BrotliRunningReadBlockLengthState;
/* BrotliDecoderState addon, used for Compound Dictionary functionality. */
typedef struct BrotliDecoderCompoundDictionary {
int num_chunks;
int total_size;
int br_index;
int br_offset;
int br_length;
int br_copied;
const uint8_t* chunks[16];
int chunk_offsets[16];
int block_bits;
uint8_t block_map[256];
} BrotliDecoderCompoundDictionary;
typedef struct BrotliMetablockHeaderArena {
BrotliRunningTreeGroupState substate_tree_group;
BrotliRunningContextMapState substate_context_map;
BrotliRunningHuffmanState substate_huffman;
brotli_reg_t sub_loop_counter;
brotli_reg_t repeat_code_len;
brotli_reg_t prev_code_len;
/* For ReadHuffmanCode. */
brotli_reg_t symbol;
brotli_reg_t repeat;
brotli_reg_t space;
/* Huffman table for "histograms". */
HuffmanCode table[32];
/* List of heads of symbol chains. */
uint16_t* symbol_lists;
/* Storage from symbol_lists. */
uint16_t symbols_lists_array[BROTLI_HUFFMAN_MAX_CODE_LENGTH + 1 +
BROTLI_NUM_COMMAND_SYMBOLS];
/* Tails of symbol chains. */
int next_symbol[32];
uint8_t code_length_code_lengths[BROTLI_CODE_LENGTH_CODES];
/* Population counts for the code lengths. */
uint16_t code_length_histo[16];
/* TODO(eustas): +2 bytes padding */
/* For HuffmanTreeGroupDecode. */
int htree_index;
HuffmanCode* next;
/* For DecodeContextMap. */
brotli_reg_t context_index;
brotli_reg_t max_run_length_prefix;
brotli_reg_t code;
HuffmanCode context_map_table[BROTLI_HUFFMAN_MAX_SIZE_272];
} BrotliMetablockHeaderArena;
typedef struct BrotliMetablockBodyArena {
uint8_t dist_extra_bits[544];
brotli_reg_t dist_offset[544];
} BrotliMetablockBodyArena;
struct BrotliDecoderStateStruct {
BrotliRunningState state;
/* This counter is reused for several disjoint loops. */
int loop_counter;
BrotliBitReader br;
brotli_alloc_func alloc_func;
brotli_free_func free_func;
void* memory_manager_opaque;
/* Temporary storage for remaining input. Brotli stream format is designed in
a way, that 64 bits are enough to make progress in decoding. */
union {
uint64_t u64;
uint8_t u8[8];
} buffer;
brotli_reg_t buffer_length;
int pos;
int max_backward_distance;
int max_distance;
int ringbuffer_size;
int ringbuffer_mask;
int dist_rb_idx;
int dist_rb[4];
int error_code;
int meta_block_remaining_len;
uint8_t* ringbuffer;
uint8_t* ringbuffer_end;
HuffmanCode* htree_command;
const uint8_t* context_lookup;
uint8_t* context_map_slice;
uint8_t* dist_context_map_slice;
/* This ring buffer holds a few past copy distances that will be used by
some special distance codes. */
HuffmanTreeGroup literal_hgroup;
HuffmanTreeGroup insert_copy_hgroup;
HuffmanTreeGroup distance_hgroup;
HuffmanCode* block_type_trees;
HuffmanCode* block_len_trees;
/* This is true if the literal context map histogram type always matches the
block type. It is then not needed to keep the context (faster decoding). */
int trivial_literal_context;
/* Distance context is actual after command is decoded and before distance is
computed. After distance computation it is used as a temporary variable. */
int distance_context;
brotli_reg_t block_length[3];
brotli_reg_t block_length_index;
brotli_reg_t num_block_types[3];
brotli_reg_t block_type_rb[6];
brotli_reg_t distance_postfix_bits;
brotli_reg_t num_direct_distance_codes;
brotli_reg_t num_dist_htrees;
uint8_t* dist_context_map;
HuffmanCode* literal_htree;
/* For partial write operations. */
size_t rb_roundtrips; /* how many times we went around the ring-buffer */
size_t partial_pos_out; /* how much output to the user in total */
/* For InverseMoveToFrontTransform. */
brotli_reg_t mtf_upper_bound;
uint32_t mtf[64 + 1];
int copy_length;
int distance_code;
uint8_t dist_htree_index;
/* TODO(eustas): +3 bytes padding */
/* Less used attributes are at the end of this struct. */
brotli_decoder_metadata_start_func metadata_start_func;
brotli_decoder_metadata_chunk_func metadata_chunk_func;
void* metadata_callback_opaque;
/* For reporting. */
uint64_t used_input; /* how many bytes of input are consumed */
/* States inside function calls. */
BrotliRunningMetablockHeaderState substate_metablock_header;
BrotliRunningUncompressedState substate_uncompressed;
BrotliRunningDecodeUint8State substate_decode_uint8;
BrotliRunningReadBlockLengthState substate_read_block_length;
int new_ringbuffer_size;
/* TODO(eustas): +4 bytes padding */
unsigned int is_last_metablock : 1;
unsigned int is_uncompressed : 1;
unsigned int is_metadata : 1;
unsigned int should_wrap_ringbuffer : 1;
unsigned int canny_ringbuffer_allocation : 1;
unsigned int large_window : 1;
unsigned int window_bits : 6;
unsigned int size_nibbles : 8;
/* TODO(eustas): +12 bits padding */
brotli_reg_t num_literal_htrees;
uint8_t* context_map;
uint8_t* context_modes;
BrotliSharedDictionary* dictionary;
BrotliDecoderCompoundDictionary* compound_dictionary;
uint32_t trivial_literal_contexts[8]; /* 256 bits */
union {
BrotliMetablockHeaderArena header;
BrotliMetablockBodyArena body;
} arena;
};
typedef struct BrotliDecoderStateStruct BrotliDecoderStateInternal;
#define BrotliDecoderState BrotliDecoderStateInternal
BROTLI_INTERNAL BROTLI_BOOL BrotliDecoderStateInit(BrotliDecoderState* s,
brotli_alloc_func alloc_func, brotli_free_func free_func, void* opaque);
BROTLI_INTERNAL void BrotliDecoderStateCleanup(BrotliDecoderState* s);
BROTLI_INTERNAL void BrotliDecoderStateMetablockBegin(BrotliDecoderState* s);
BROTLI_INTERNAL void BrotliDecoderStateCleanupAfterMetablock(
BrotliDecoderState* s);
BROTLI_INTERNAL BROTLI_BOOL BrotliDecoderHuffmanTreeGroupInit(
BrotliDecoderState* s, HuffmanTreeGroup* group,
brotli_reg_t alphabet_size_max, brotli_reg_t alphabet_size_limit,
brotli_reg_t ntrees);
#define BROTLI_DECODER_ALLOC(S, L) S->alloc_func(S->memory_manager_opaque, L)
#define BROTLI_DECODER_FREE(S, X) { \
S->free_func(S->memory_manager_opaque, X); \
X = NULL; \
}
/* Literal/Command/Distance block size maximum; same as maximum metablock size;
used as block size when there is no block switching. */
#define BROTLI_BLOCK_SIZE_CAP (1U << 24)
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_DEC_STATE_H_ */

53
c/dec/static_init.c Normal file
View File

@@ -0,0 +1,53 @@
/* Copyright 2025 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "static_init.h"
#include "../common/platform.h"
#include "../common/static_init.h"
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
#include "../common/dictionary.h"
#include "prefix.h"
#endif
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
static BROTLI_BOOL DoBrotliDecoderStaticInit(void) {
BROTLI_BOOL ok = BrotliDecoderInitCmdLut(kCmdLut);
if (!ok) return BROTLI_FALSE;
return BROTLI_TRUE;
}
#endif /* BROTLI_STATIC_INIT_NONE */
#if (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_EARLY)
static BROTLI_BOOL kEarlyInitOk;
static __attribute__((constructor)) void BrotliDecoderStaticInitEarly(void) {
kEarlyInitOk = DoBrotliDecoderStaticInit();
}
#elif (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_LAZY)
static BROTLI_BOOL kLazyInitOk;
void BrotliDecoderLazyStaticInitInner(void) {
kLazyInitOk = DoBrotliDecoderStaticInit();
}
#endif /* BROTLI_STATIC_INIT_EARLY */
BROTLI_BOOL BrotliDecoderEnsureStaticInit(void) {
#if (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_NONE)
return BROTLI_TRUE;
#elif (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_EARLY)
return kEarlyInitOk;
#else
return kLazyInitOk;
#endif
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

30
c/dec/static_init.h Normal file
View File

@@ -0,0 +1,30 @@
/* Copyright 2025 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Central point for static initialization. */
#ifndef THIRD_PARTY_BROTLI_DEC_STATIC_INIT_H_
#define THIRD_PARTY_BROTLI_DEC_STATIC_INIT_H_
#include "../common/platform.h"
#include "../common/static_init.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#if (BROTLI_STATIC_INIT == BROTLI_STATIC_INIT_LAZY)
BROTLI_INTERNAL void BrotliDecoderLazyStaticInitInner(void);
BROTLI_INTERNAL void BrotliDecoderLazyStaticInit(void);
#endif /* BROTLI_STATIC_INIT */
BROTLI_INTERNAL BROTLI_BOOL BrotliDecoderEnsureStaticInit(void);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif // THIRD_PARTY_BROTLI_DEC_STATIC_INIT_H_

231
c/enc/backward_references.c Normal file
View File

@@ -0,0 +1,231 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function to find backward reference copies. */
#include "backward_references.h"
#include "../common/constants.h"
#include "../common/context.h"
#include "../common/platform.h"
#include "command.h"
#include "compound_dictionary.h"
#include "encoder_dict.h"
#include "hash.h"
#include "params.h"
#include "quality.h" /* IWYU pragma: keep for inc */
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
static BROTLI_INLINE size_t ComputeDistanceCode(size_t distance,
size_t max_distance,
const int* dist_cache) {
if (distance <= max_distance) {
size_t distance_plus_3 = distance + 3;
size_t offset0 = distance_plus_3 - (size_t)dist_cache[0];
size_t offset1 = distance_plus_3 - (size_t)dist_cache[1];
if (distance == (size_t)dist_cache[0]) {
return 0;
} else if (distance == (size_t)dist_cache[1]) {
return 1;
} else if (offset0 < 7) {
return (0x9750468 >> (4 * offset0)) & 0xF;
} else if (offset1 < 7) {
return (0xFDB1ACE >> (4 * offset1)) & 0xF;
} else if (distance == (size_t)dist_cache[2]) {
return 2;
} else if (distance == (size_t)dist_cache[3]) {
return 3;
}
}
return distance + BROTLI_NUM_DISTANCE_SHORT_CODES - 1;
}
#define EXPAND_CAT(a, b) CAT(a, b)
#define CAT(a, b) a ## b
#define FN(X) EXPAND_CAT(X, HASHER())
#define EXPORT_FN(X) EXPAND_CAT(X, EXPAND_CAT(PREFIX(), HASHER()))
#define PREFIX() N
#define ENABLE_COMPOUND_DICTIONARY 0
#define HASHER() H2
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H3
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H4
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H5
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H6
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H40
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H41
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H42
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H54
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H35
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H55
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H65
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#if defined(BROTLI_MAX_SIMD_QUALITY)
#define HASHER() H58
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H68
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#endif
#undef ENABLE_COMPOUND_DICTIONARY
#undef PREFIX
#define PREFIX() D
#define ENABLE_COMPOUND_DICTIONARY 1
#define HASHER() H5
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H6
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H40
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H41
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H42
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H55
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H65
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#if defined(BROTLI_MAX_SIMD_QUALITY)
#define HASHER() H58
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#define HASHER() H68
/* NOLINTNEXTLINE(build/include) */
#include "backward_references_inc.h"
#undef HASHER
#endif
#undef ENABLE_COMPOUND_DICTIONARY
#undef PREFIX
#undef EXPORT_FN
#undef FN
#undef CAT
#undef EXPAND_CAT
void BrotliCreateBackwardReferences(size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals) {
if (params->dictionary.compound.num_chunks != 0) {
switch (params->hasher.type) {
#define CASE_(N) \
case N: \
CreateBackwardReferencesDH ## N(num_bytes, \
position, ringbuffer, ringbuffer_mask, \
literal_context_lut, params, hasher, dist_cache, \
last_insert_len, commands, num_commands, num_literals); \
return;
CASE_(5)
CASE_(6)
#if defined(BROTLI_MAX_SIMD_QUALITY)
CASE_(58)
CASE_(68)
#endif
CASE_(40)
CASE_(41)
CASE_(42)
CASE_(55)
CASE_(65)
#undef CASE_
default:
BROTLI_DCHECK(BROTLI_FALSE);
break;
}
}
switch (params->hasher.type) {
#define CASE_(N) \
case N: \
CreateBackwardReferencesNH ## N(num_bytes, \
position, ringbuffer, ringbuffer_mask, \
literal_context_lut, params, hasher, dist_cache, \
last_insert_len, commands, num_commands, num_literals); \
return;
FOR_GENERIC_HASHERS(CASE_)
#undef CASE_
default:
BROTLI_DCHECK(BROTLI_FALSE);
break;
}
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

View File

@@ -0,0 +1,36 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function to find backward reference copies. */
#ifndef BROTLI_ENC_BACKWARD_REFERENCES_H_
#define BROTLI_ENC_BACKWARD_REFERENCES_H_
#include "../common/context.h"
#include "../common/platform.h"
#include "command.h"
#include "hash.h"
#include "params.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* "commands" points to the next output command to write to, "*num_commands" is
initially the total amount of commands output by previous
CreateBackwardReferences calls, and must be incremented by the amount written
by this call. */
BROTLI_INTERNAL void BrotliCreateBackwardReferences(size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_BACKWARD_REFERENCES_H_ */

View File

@@ -0,0 +1,939 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function to find backward reference copies. */
#include "backward_references_hq.h"
#include "../common/constants.h"
#include "../common/context.h"
#include "../common/platform.h"
#include "command.h"
#include "compound_dictionary.h"
#include "encoder_dict.h"
#include "fast_log.h"
#include "find_match_length.h"
#include "hash.h"
#include "literal_cost.h"
#include "memory.h"
#include "params.h"
#include "prefix.h"
#include "quality.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* BrotliCalculateDistanceCodeLimit(BROTLI_MAX_ALLOWED_DISTANCE, 3, 120). */
#define BROTLI_MAX_EFFECTIVE_DISTANCE_ALPHABET_SIZE 544
static const float kInfinity = 1.7e38f; /* ~= 2 ^ 127 */
static const BROTLI_MODEL("small") uint32_t kDistanceCacheIndex[] = {
0, 1, 2, 3, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1,
};
static const BROTLI_MODEL("small") int kDistanceCacheOffset[] = {
0, 0, 0, 0, -1, 1, -2, 2, -3, 3, -1, 1, -2, 2, -3, 3
};
void BrotliInitZopfliNodes(ZopfliNode* array, size_t length) {
ZopfliNode stub;
size_t i;
stub.length = 1;
stub.distance = 0;
stub.dcode_insert_length = 0;
stub.u.cost = kInfinity;
for (i = 0; i < length; ++i) array[i] = stub;
}
static BROTLI_INLINE uint32_t ZopfliNodeCopyLength(const ZopfliNode* self) {
return self->length & 0x1FFFFFF;
}
static BROTLI_INLINE uint32_t ZopfliNodeLengthCode(const ZopfliNode* self) {
const uint32_t modifier = self->length >> 25;
return ZopfliNodeCopyLength(self) + 9u - modifier;
}
static BROTLI_INLINE uint32_t ZopfliNodeCopyDistance(const ZopfliNode* self) {
return self->distance;
}
static BROTLI_INLINE uint32_t ZopfliNodeDistanceCode(const ZopfliNode* self) {
const uint32_t short_code = self->dcode_insert_length >> 27;
return short_code == 0 ?
ZopfliNodeCopyDistance(self) + BROTLI_NUM_DISTANCE_SHORT_CODES - 1 :
short_code - 1;
}
static BROTLI_INLINE uint32_t ZopfliNodeCommandLength(const ZopfliNode* self) {
return ZopfliNodeCopyLength(self) + (self->dcode_insert_length & 0x7FFFFFF);
}
/* Temporary data for ZopfliCostModelSetFromCommands. */
typedef struct ZopfliCostModelArena {
uint32_t histogram_literal[BROTLI_NUM_LITERAL_SYMBOLS];
uint32_t histogram_cmd[BROTLI_NUM_COMMAND_SYMBOLS];
uint32_t histogram_dist[BROTLI_MAX_EFFECTIVE_DISTANCE_ALPHABET_SIZE];
float cost_literal[BROTLI_NUM_LITERAL_SYMBOLS];
} ZopfliCostModelArena;
/* Histogram based cost model for zopflification. */
typedef struct ZopfliCostModel {
/* The insert and copy length symbols. */
float cost_cmd_[BROTLI_NUM_COMMAND_SYMBOLS];
float* cost_dist_;
uint32_t distance_histogram_size;
/* Cumulative costs of literals per position in the stream. */
float* literal_costs_;
float min_cost_cmd_;
size_t num_bytes_;
/* Temporary data. */
union {
size_t literal_histograms[3 * 256];
ZopfliCostModelArena arena;
};
} ZopfliCostModel;
static void InitZopfliCostModel(
MemoryManager* m, ZopfliCostModel* self, const BrotliDistanceParams* dist,
size_t num_bytes) {
self->num_bytes_ = num_bytes;
self->literal_costs_ = BROTLI_ALLOC(m, float, num_bytes + 2);
self->cost_dist_ = BROTLI_ALLOC(m, float, dist->alphabet_size_limit);
self->distance_histogram_size = dist->alphabet_size_limit;
if (BROTLI_IS_OOM(m)) return;
}
static void CleanupZopfliCostModel(MemoryManager* m, ZopfliCostModel* self) {
BROTLI_FREE(m, self->literal_costs_);
BROTLI_FREE(m, self->cost_dist_);
}
static void SetCost(const uint32_t* histogram, size_t histogram_size,
BROTLI_BOOL literal_histogram, float* cost) {
size_t sum = 0;
size_t missing_symbol_sum;
float log2sum;
float missing_symbol_cost;
size_t i;
for (i = 0; i < histogram_size; i++) {
sum += histogram[i];
}
log2sum = (float)FastLog2(sum);
missing_symbol_sum = sum;
if (!literal_histogram) {
for (i = 0; i < histogram_size; i++) {
if (histogram[i] == 0) missing_symbol_sum++;
}
}
missing_symbol_cost = (float)FastLog2(missing_symbol_sum) + 2;
for (i = 0; i < histogram_size; i++) {
if (histogram[i] == 0) {
cost[i] = missing_symbol_cost;
continue;
}
/* Shannon bits for this symbol. */
cost[i] = log2sum - (float)FastLog2(histogram[i]);
/* Cannot be coded with less than 1 bit */
if (cost[i] < 1) cost[i] = 1;
}
}
static void ZopfliCostModelSetFromCommands(ZopfliCostModel* self,
size_t position,
const uint8_t* ringbuffer,
size_t ringbuffer_mask,
const Command* commands,
size_t num_commands,
size_t last_insert_len) {
ZopfliCostModelArena* arena = &self->arena;
size_t pos = position - last_insert_len;
float min_cost_cmd = kInfinity;
size_t i;
float* cost_cmd = self->cost_cmd_;
memset(arena->histogram_literal, 0, sizeof(arena->histogram_literal));
memset(arena->histogram_cmd, 0, sizeof(arena->histogram_cmd));
memset(arena->histogram_dist, 0, sizeof(arena->histogram_dist));
for (i = 0; i < num_commands; i++) {
size_t inslength = commands[i].insert_len_;
size_t copylength = CommandCopyLen(&commands[i]);
size_t distcode = commands[i].dist_prefix_ & 0x3FF;
size_t cmdcode = commands[i].cmd_prefix_;
size_t j;
arena->histogram_cmd[cmdcode]++;
if (cmdcode >= 128) arena->histogram_dist[distcode]++;
for (j = 0; j < inslength; j++) {
arena->histogram_literal[ringbuffer[(pos + j) & ringbuffer_mask]]++;
}
pos += inslength + copylength;
}
SetCost(arena->histogram_literal, BROTLI_NUM_LITERAL_SYMBOLS, BROTLI_TRUE,
arena->cost_literal);
SetCost(arena->histogram_cmd, BROTLI_NUM_COMMAND_SYMBOLS, BROTLI_FALSE,
cost_cmd);
SetCost(arena->histogram_dist, self->distance_histogram_size, BROTLI_FALSE,
self->cost_dist_);
for (i = 0; i < BROTLI_NUM_COMMAND_SYMBOLS; ++i) {
min_cost_cmd = BROTLI_MIN(float, min_cost_cmd, cost_cmd[i]);
}
self->min_cost_cmd_ = min_cost_cmd;
{
float* literal_costs = self->literal_costs_;
float literal_carry = 0.0;
size_t num_bytes = self->num_bytes_;
literal_costs[0] = 0.0;
for (i = 0; i < num_bytes; ++i) {
literal_carry +=
arena->cost_literal[ringbuffer[(position + i) & ringbuffer_mask]];
literal_costs[i + 1] = literal_costs[i] + literal_carry;
literal_carry -= literal_costs[i + 1] - literal_costs[i];
}
}
}
static void ZopfliCostModelSetFromLiteralCosts(ZopfliCostModel* self,
size_t position,
const uint8_t* ringbuffer,
size_t ringbuffer_mask) {
float* literal_costs = self->literal_costs_;
float literal_carry = 0.0;
float* cost_dist = self->cost_dist_;
float* cost_cmd = self->cost_cmd_;
size_t num_bytes = self->num_bytes_;
size_t i;
BrotliEstimateBitCostsForLiterals(position, num_bytes, ringbuffer_mask,
ringbuffer, self->literal_histograms,
&literal_costs[1]);
literal_costs[0] = 0.0;
for (i = 0; i < num_bytes; ++i) {
literal_carry += literal_costs[i + 1];
literal_costs[i + 1] = literal_costs[i] + literal_carry;
literal_carry -= literal_costs[i + 1] - literal_costs[i];
}
for (i = 0; i < BROTLI_NUM_COMMAND_SYMBOLS; ++i) {
cost_cmd[i] = (float)FastLog2(11 + (uint32_t)i);
}
for (i = 0; i < self->distance_histogram_size; ++i) {
cost_dist[i] = (float)FastLog2(20 + (uint32_t)i);
}
self->min_cost_cmd_ = (float)FastLog2(11);
}
static BROTLI_INLINE float ZopfliCostModelGetCommandCost(
const ZopfliCostModel* self, uint16_t cmdcode) {
return self->cost_cmd_[cmdcode];
}
static BROTLI_INLINE float ZopfliCostModelGetDistanceCost(
const ZopfliCostModel* self, size_t distcode) {
return self->cost_dist_[distcode];
}
static BROTLI_INLINE float ZopfliCostModelGetLiteralCosts(
const ZopfliCostModel* self, size_t from, size_t to) {
return self->literal_costs_[to] - self->literal_costs_[from];
}
static BROTLI_INLINE float ZopfliCostModelGetMinCostCmd(
const ZopfliCostModel* self) {
return self->min_cost_cmd_;
}
/* REQUIRES: len >= 2, start_pos <= pos */
/* REQUIRES: cost < kInfinity, nodes[start_pos].cost < kInfinity */
/* Maintains the "ZopfliNode array invariant". */
static BROTLI_INLINE void UpdateZopfliNode(ZopfliNode* nodes, size_t pos,
size_t start_pos, size_t len, size_t len_code, size_t dist,
size_t short_code, float cost) {
ZopfliNode* next = &nodes[pos + len];
next->length = (uint32_t)(len | ((len + 9u - len_code) << 25));
next->distance = (uint32_t)dist;
next->dcode_insert_length = (uint32_t)(
(short_code << 27) | (pos - start_pos));
next->u.cost = cost;
}
typedef struct PosData {
size_t pos;
int distance_cache[4];
float costdiff;
float cost;
} PosData;
/* Maintains the smallest 8 cost difference together with their positions */
typedef struct StartPosQueue {
PosData q_[8];
size_t idx_;
} StartPosQueue;
static BROTLI_INLINE void InitStartPosQueue(StartPosQueue* self) {
self->idx_ = 0;
}
static size_t StartPosQueueSize(const StartPosQueue* self) {
return BROTLI_MIN(size_t, self->idx_, 8);
}
static void StartPosQueuePush(StartPosQueue* self, const PosData* posdata) {
size_t offset = ~(self->idx_++) & 7;
size_t len = StartPosQueueSize(self);
size_t i;
PosData* q = self->q_;
q[offset] = *posdata;
/* Restore the sorted order. In the list of |len| items at most |len - 1|
adjacent element comparisons / swaps are required. */
for (i = 1; i < len; ++i) {
if (q[offset & 7].costdiff > q[(offset + 1) & 7].costdiff) {
BROTLI_SWAP(PosData, q, offset & 7, (offset + 1) & 7);
}
++offset;
}
}
static const PosData* StartPosQueueAt(const StartPosQueue* self, size_t k) {
return &self->q_[(k - self->idx_) & 7];
}
/* Returns the minimum possible copy length that can improve the cost of any */
/* future position. */
static size_t ComputeMinimumCopyLength(const float start_cost,
const ZopfliNode* nodes,
const size_t num_bytes,
const size_t pos) {
/* Compute the minimum possible cost of reaching any future position. */
float min_cost = start_cost;
size_t len = 2;
size_t next_len_bucket = 4;
size_t next_len_offset = 10;
while (pos + len <= num_bytes && nodes[pos + len].u.cost <= min_cost) {
/* We already reached (pos + len) with no more cost than the minimum
possible cost of reaching anything from this pos, so there is no point in
looking for lengths <= len. */
++len;
if (len == next_len_offset) {
/* We reached the next copy length code bucket, so we add one more
extra bit to the minimum cost. */
min_cost += 1.0f;
next_len_offset += next_len_bucket;
next_len_bucket *= 2;
}
}
return len;
}
/* REQUIRES: nodes[pos].cost < kInfinity
REQUIRES: nodes[0..pos] satisfies that "ZopfliNode array invariant". */
static uint32_t ComputeDistanceShortcut(const size_t block_start,
const size_t pos,
const size_t max_backward_limit,
const size_t gap,
const ZopfliNode* nodes) {
const size_t c_len = ZopfliNodeCopyLength(&nodes[pos]);
const size_t i_len = nodes[pos].dcode_insert_length & 0x7FFFFFF;
const size_t dist = ZopfliNodeCopyDistance(&nodes[pos]);
/* Since |block_start + pos| is the end position of the command, the copy part
starts from |block_start + pos - c_len|. Distances that are greater than
this or greater than |max_backward_limit| + |gap| are static dictionary
references, and do not update the last distances.
Also distance code 0 (last distance) does not update the last distances. */
if (pos == 0) {
return 0;
} else if (dist + c_len <= block_start + pos + gap &&
dist <= max_backward_limit + gap &&
ZopfliNodeDistanceCode(&nodes[pos]) > 0) {
return (uint32_t)pos;
} else {
return nodes[pos - c_len - i_len].u.shortcut;
}
}
/* Fills in dist_cache[0..3] with the last four distances (as defined by
Section 4. of the Spec) that would be used at (block_start + pos) if we
used the shortest path of commands from block_start, computed from
nodes[0..pos]. The last four distances at block_start are in
starting_dist_cache[0..3].
REQUIRES: nodes[pos].cost < kInfinity
REQUIRES: nodes[0..pos] satisfies that "ZopfliNode array invariant". */
static void ComputeDistanceCache(const size_t pos,
const int* starting_dist_cache,
const ZopfliNode* nodes,
int* dist_cache) {
int idx = 0;
size_t p = nodes[pos].u.shortcut;
while (idx < 4 && p > 0) {
const size_t i_len = nodes[p].dcode_insert_length & 0x7FFFFFF;
const size_t c_len = ZopfliNodeCopyLength(&nodes[p]);
const size_t dist = ZopfliNodeCopyDistance(&nodes[p]);
dist_cache[idx++] = (int)dist;
/* Because of prerequisite, p >= c_len + i_len >= 2. */
p = nodes[p - c_len - i_len].u.shortcut;
}
for (; idx < 4; ++idx) {
dist_cache[idx] = *starting_dist_cache++;
}
}
/* Maintains "ZopfliNode array invariant" and pushes node to the queue, if it
is eligible. */
static void EvaluateNode(
const size_t block_start, const size_t pos, const size_t max_backward_limit,
const size_t gap, const int* starting_dist_cache,
const ZopfliCostModel* model, StartPosQueue* queue, ZopfliNode* nodes) {
/* Save cost, because ComputeDistanceCache invalidates it. */
float node_cost = nodes[pos].u.cost;
nodes[pos].u.shortcut = ComputeDistanceShortcut(
block_start, pos, max_backward_limit, gap, nodes);
if (node_cost <= ZopfliCostModelGetLiteralCosts(model, 0, pos)) {
PosData posdata;
posdata.pos = pos;
posdata.cost = node_cost;
posdata.costdiff = node_cost -
ZopfliCostModelGetLiteralCosts(model, 0, pos);
ComputeDistanceCache(
pos, starting_dist_cache, nodes, posdata.distance_cache);
StartPosQueuePush(queue, &posdata);
}
}
/* Returns longest copy length. */
static size_t UpdateNodes(
const size_t num_bytes, const size_t block_start, const size_t pos,
const uint8_t* ringbuffer, const size_t ringbuffer_mask,
const BrotliEncoderParams* params, const size_t max_backward_limit,
const int* starting_dist_cache, const size_t num_matches,
const BackwardMatch* matches, const ZopfliCostModel* model,
StartPosQueue* queue, ZopfliNode* nodes) {
const size_t stream_offset = params->stream_offset;
const size_t cur_ix = block_start + pos;
const size_t cur_ix_masked = cur_ix & ringbuffer_mask;
const size_t max_distance = BROTLI_MIN(size_t, cur_ix, max_backward_limit);
const size_t dictionary_start = BROTLI_MIN(size_t,
cur_ix + stream_offset, max_backward_limit);
const size_t max_len = num_bytes - pos;
const size_t max_zopfli_len = MaxZopfliLen(params);
const size_t max_iters = MaxZopfliCandidates(params);
size_t min_len;
size_t result = 0;
size_t k;
const CompoundDictionary* addon = &params->dictionary.compound;
size_t gap = addon->total_size;
BROTLI_DCHECK(cur_ix_masked + max_len <= ringbuffer_mask);
EvaluateNode(block_start + stream_offset, pos, max_backward_limit, gap,
starting_dist_cache, model, queue, nodes);
{
const PosData* posdata = StartPosQueueAt(queue, 0);
float min_cost = (posdata->cost + ZopfliCostModelGetMinCostCmd(model) +
ZopfliCostModelGetLiteralCosts(model, posdata->pos, pos));
min_len = ComputeMinimumCopyLength(min_cost, nodes, num_bytes, pos);
}
/* Go over the command starting positions in order of increasing cost
difference. */
for (k = 0; k < max_iters && k < StartPosQueueSize(queue); ++k) {
const PosData* posdata = StartPosQueueAt(queue, k);
const size_t start = posdata->pos;
const uint16_t inscode = GetInsertLengthCode(pos - start);
const float start_costdiff = posdata->costdiff;
const float base_cost = start_costdiff + (float)GetInsertExtra(inscode) +
ZopfliCostModelGetLiteralCosts(model, 0, pos);
/* Look for last distance matches using the distance cache from this
starting position. */
size_t best_len = min_len - 1;
size_t j = 0;
for (; j < BROTLI_NUM_DISTANCE_SHORT_CODES && best_len < max_len; ++j) {
const size_t idx = kDistanceCacheIndex[j];
const size_t backward =
(size_t)(posdata->distance_cache[idx] + kDistanceCacheOffset[j]);
size_t prev_ix = cur_ix - backward;
size_t len = 0;
uint8_t continuation = ringbuffer[cur_ix_masked + best_len];
if (cur_ix_masked + best_len > ringbuffer_mask) {
break;
}
if (BROTLI_PREDICT_FALSE(backward > dictionary_start + gap)) {
/* Word dictionary -> ignore. */
continue;
}
if (backward <= max_distance) {
/* Regular backward reference. */
if (prev_ix >= cur_ix) {
continue;
}
prev_ix &= ringbuffer_mask;
if (prev_ix + best_len > ringbuffer_mask ||
continuation != ringbuffer[prev_ix + best_len]) {
continue;
}
len = FindMatchLengthWithLimit(&ringbuffer[prev_ix],
&ringbuffer[cur_ix_masked],
max_len);
} else if (backward > dictionary_start) {
size_t d = 0;
size_t offset;
size_t limit;
const uint8_t* source;
offset = dictionary_start + 1 + addon->total_size - 1;
while (offset >= backward + addon->chunk_offsets[d + 1]) d++;
source = addon->chunk_source[d];
offset = offset - addon->chunk_offsets[d] - backward;
limit = addon->chunk_offsets[d + 1] - addon->chunk_offsets[d] - offset;
limit = limit > max_len ? max_len : limit;
if (best_len >= limit ||
continuation != source[offset + best_len]) {
continue;
}
len = FindMatchLengthWithLimit(&source[offset],
&ringbuffer[cur_ix_masked],
limit);
} else {
/* "Gray" area. It is addressable by decoder, but this encoder
instance does not have that data -> should not touch it. */
continue;
}
{
const float dist_cost = base_cost +
ZopfliCostModelGetDistanceCost(model, j);
size_t l;
for (l = best_len + 1; l <= len; ++l) {
const uint16_t copycode = GetCopyLengthCode(l);
const uint16_t cmdcode =
CombineLengthCodes(inscode, copycode, j == 0);
const float cost = (cmdcode < 128 ? base_cost : dist_cost) +
(float)GetCopyExtra(copycode) +
ZopfliCostModelGetCommandCost(model, cmdcode);
if (cost < nodes[pos + l].u.cost) {
UpdateZopfliNode(nodes, pos, start, l, l, backward, j + 1, cost);
result = BROTLI_MAX(size_t, result, l);
}
best_len = l;
}
}
}
/* At higher iterations look only for new last distance matches, since
looking only for new command start positions with the same distances
does not help much. */
if (k >= 2) continue;
{
/* Loop through all possible copy lengths at this position. */
size_t len = min_len;
for (j = 0; j < num_matches; ++j) {
BackwardMatch match = matches[j];
size_t dist = match.distance;
BROTLI_BOOL is_dictionary_match =
TO_BROTLI_BOOL(dist > dictionary_start + gap);
/* We already tried all possible last distance matches, so we can use
normal distance code here. */
size_t dist_code = dist + BROTLI_NUM_DISTANCE_SHORT_CODES - 1;
uint16_t dist_symbol;
uint32_t distextra;
uint32_t distnumextra;
float dist_cost;
size_t max_match_len;
PrefixEncodeCopyDistance(
dist_code, params->dist.num_direct_distance_codes,
params->dist.distance_postfix_bits, &dist_symbol, &distextra);
distnumextra = dist_symbol >> 10;
dist_cost = base_cost + (float)distnumextra +
ZopfliCostModelGetDistanceCost(model, dist_symbol & 0x3FF);
/* Try all copy lengths up until the maximum copy length corresponding
to this distance. If the distance refers to the static dictionary, or
the maximum length is long enough, try only one maximum length. */
max_match_len = BackwardMatchLength(&match);
if (len < max_match_len &&
(is_dictionary_match || max_match_len > max_zopfli_len)) {
len = max_match_len;
}
for (; len <= max_match_len; ++len) {
const size_t len_code =
is_dictionary_match ? BackwardMatchLengthCode(&match) : len;
const uint16_t copycode = GetCopyLengthCode(len_code);
const uint16_t cmdcode = CombineLengthCodes(inscode, copycode, 0);
const float cost = dist_cost + (float)GetCopyExtra(copycode) +
ZopfliCostModelGetCommandCost(model, cmdcode);
if (cost < nodes[pos + len].u.cost) {
UpdateZopfliNode(nodes, pos, start, len, len_code, dist, 0, cost);
result = BROTLI_MAX(size_t, result, len);
}
}
}
}
}
return result;
}
static size_t ComputeShortestPathFromNodes(size_t num_bytes,
ZopfliNode* nodes) {
size_t index = num_bytes;
size_t num_commands = 0;
while ((nodes[index].dcode_insert_length & 0x7FFFFFF) == 0 &&
nodes[index].length == 1) --index;
nodes[index].u.next = BROTLI_UINT32_MAX;
while (index != 0) {
size_t len = ZopfliNodeCommandLength(&nodes[index]);
index -= len;
nodes[index].u.next = (uint32_t)len;
num_commands++;
}
return num_commands;
}
/* REQUIRES: nodes != NULL and len(nodes) >= num_bytes + 1 */
void BrotliZopfliCreateCommands(const size_t num_bytes,
const size_t block_start, const ZopfliNode* nodes, int* dist_cache,
size_t* last_insert_len, const BrotliEncoderParams* params,
Command* commands, size_t* num_literals) {
const size_t stream_offset = params->stream_offset;
const size_t max_backward_limit = BROTLI_MAX_BACKWARD_LIMIT(params->lgwin);
size_t pos = 0;
uint32_t offset = nodes[0].u.next;
size_t i;
size_t gap = params->dictionary.compound.total_size;
for (i = 0; offset != BROTLI_UINT32_MAX; i++) {
const ZopfliNode* next = &nodes[pos + offset];
size_t copy_length = ZopfliNodeCopyLength(next);
size_t insert_length = next->dcode_insert_length & 0x7FFFFFF;
pos += insert_length;
offset = next->u.next;
if (i == 0) {
insert_length += *last_insert_len;
*last_insert_len = 0;
}
{
size_t distance = ZopfliNodeCopyDistance(next);
size_t len_code = ZopfliNodeLengthCode(next);
size_t dictionary_start = BROTLI_MIN(size_t,
block_start + pos + stream_offset, max_backward_limit);
BROTLI_BOOL is_dictionary =
TO_BROTLI_BOOL(distance > dictionary_start + gap);
size_t dist_code = ZopfliNodeDistanceCode(next);
InitCommand(&commands[i], &params->dist, insert_length,
copy_length, (int)len_code - (int)copy_length, dist_code);
if (!is_dictionary && dist_code > 0) {
dist_cache[3] = dist_cache[2];
dist_cache[2] = dist_cache[1];
dist_cache[1] = dist_cache[0];
dist_cache[0] = (int)distance;
}
}
*num_literals += insert_length;
pos += copy_length;
}
*last_insert_len += num_bytes - pos;
}
static size_t ZopfliIterate(size_t num_bytes, size_t position,
const uint8_t* ringbuffer, size_t ringbuffer_mask,
const BrotliEncoderParams* params, const size_t gap, const int* dist_cache,
const ZopfliCostModel* model, const uint32_t* num_matches,
const BackwardMatch* matches, ZopfliNode* nodes) {
const size_t stream_offset = params->stream_offset;
const size_t max_backward_limit = BROTLI_MAX_BACKWARD_LIMIT(params->lgwin);
const size_t max_zopfli_len = MaxZopfliLen(params);
StartPosQueue queue;
size_t cur_match_pos = 0;
size_t i;
nodes[0].length = 0;
nodes[0].u.cost = 0;
InitStartPosQueue(&queue);
for (i = 0; i + 3 < num_bytes; i++) {
size_t skip = UpdateNodes(num_bytes, position, i, ringbuffer,
ringbuffer_mask, params, max_backward_limit, dist_cache,
num_matches[i], &matches[cur_match_pos], model, &queue, nodes);
if (skip < BROTLI_LONG_COPY_QUICK_STEP) skip = 0;
cur_match_pos += num_matches[i];
if (num_matches[i] == 1 &&
BackwardMatchLength(&matches[cur_match_pos - 1]) > max_zopfli_len) {
skip = BROTLI_MAX(size_t,
BackwardMatchLength(&matches[cur_match_pos - 1]), skip);
}
if (skip > 1) {
skip--;
while (skip) {
i++;
if (i + 3 >= num_bytes) break;
EvaluateNode(position + stream_offset, i, max_backward_limit, gap,
dist_cache, model, &queue, nodes);
cur_match_pos += num_matches[i];
skip--;
}
}
}
return ComputeShortestPathFromNodes(num_bytes, nodes);
}
static void MergeMatches(BackwardMatch* dst,
BackwardMatch* src1, size_t len1, BackwardMatch* src2, size_t len2) {
while (len1 > 0 && len2 > 0) {
size_t l1 = BackwardMatchLength(src1);
size_t l2 = BackwardMatchLength(src2);
if (l1 < l2 || ((l1 == l2) && (src1->distance < src2->distance))) {
*dst++ = *src1++;
len1--;
} else {
*dst++ = *src2++;
len2--;
}
}
while (len1-- > 0) *dst++ = *src1++;
while (len2-- > 0) *dst++ = *src2++;
}
/* REQUIRES: nodes != NULL and len(nodes) >= num_bytes + 1 */
size_t BrotliZopfliComputeShortestPath(MemoryManager* m, size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
const int* dist_cache, Hasher* hasher, ZopfliNode* nodes) {
const size_t stream_offset = params->stream_offset;
const size_t max_backward_limit = BROTLI_MAX_BACKWARD_LIMIT(params->lgwin);
const size_t max_zopfli_len = MaxZopfliLen(params);
StartPosQueue queue;
BackwardMatch* BROTLI_RESTRICT matches =
BROTLI_ALLOC(m, BackwardMatch, 2 * (MAX_NUM_MATCHES_H10 + 64));
const size_t store_end = num_bytes >= StoreLookaheadH10() ?
position + num_bytes - StoreLookaheadH10() + 1 : position;
size_t i;
const CompoundDictionary* addon = &params->dictionary.compound;
size_t gap = addon->total_size;
size_t lz_matches_offset =
(addon->num_chunks != 0) ? (MAX_NUM_MATCHES_H10 + 128) : 0;
ZopfliCostModel* model = BROTLI_ALLOC(m, ZopfliCostModel, 1);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(model) || BROTLI_IS_NULL(matches)) {
return 0;
}
nodes[0].length = 0;
nodes[0].u.cost = 0;
InitZopfliCostModel(m, model, &params->dist, num_bytes);
if (BROTLI_IS_OOM(m)) return 0;
ZopfliCostModelSetFromLiteralCosts(
model, position, ringbuffer, ringbuffer_mask);
InitStartPosQueue(&queue);
for (i = 0; i + HashTypeLengthH10() - 1 < num_bytes; i++) {
const size_t pos = position + i;
const size_t max_distance = BROTLI_MIN(size_t, pos, max_backward_limit);
const size_t dictionary_start = BROTLI_MIN(size_t,
pos + stream_offset, max_backward_limit);
size_t skip;
size_t num_matches;
int dict_id = 0;
if (params->dictionary.contextual.context_based) {
uint8_t p1 = pos >= 1 ?
ringbuffer[(size_t)(pos - 1) & ringbuffer_mask] : 0;
uint8_t p2 = pos >= 2 ?
ringbuffer[(size_t)(pos - 2) & ringbuffer_mask] : 0;
dict_id = params->dictionary.contextual.context_map[
BROTLI_CONTEXT(p1, p2, literal_context_lut)];
}
num_matches = FindAllMatchesH10(&hasher->privat._H10,
params->dictionary.contextual.dict[dict_id],
ringbuffer, ringbuffer_mask, pos, num_bytes - i, max_distance,
dictionary_start + gap, params, &matches[lz_matches_offset]);
if (addon->num_chunks != 0) {
size_t cd_matches = LookupAllCompoundDictionaryMatches(addon,
ringbuffer, ringbuffer_mask, pos, 3, num_bytes - i,
dictionary_start, params->dist.max_distance,
&matches[lz_matches_offset - 64], 64);
MergeMatches(matches, &matches[lz_matches_offset - 64], cd_matches,
&matches[lz_matches_offset], num_matches);
num_matches += cd_matches;
}
if (num_matches > 0 &&
BackwardMatchLength(&matches[num_matches - 1]) > max_zopfli_len) {
matches[0] = matches[num_matches - 1];
num_matches = 1;
}
skip = UpdateNodes(num_bytes, position, i, ringbuffer, ringbuffer_mask,
params, max_backward_limit, dist_cache, num_matches, matches, model,
&queue, nodes);
if (skip < BROTLI_LONG_COPY_QUICK_STEP) skip = 0;
if (num_matches == 1 && BackwardMatchLength(&matches[0]) > max_zopfli_len) {
skip = BROTLI_MAX(size_t, BackwardMatchLength(&matches[0]), skip);
}
if (skip > 1) {
/* Add the tail of the copy to the hasher. */
StoreRangeH10(&hasher->privat._H10,
ringbuffer, ringbuffer_mask, pos + 1, BROTLI_MIN(
size_t, pos + skip, store_end));
skip--;
while (skip) {
i++;
if (i + HashTypeLengthH10() - 1 >= num_bytes) break;
EvaluateNode(position + stream_offset, i, max_backward_limit, gap,
dist_cache, model, &queue, nodes);
skip--;
}
}
}
CleanupZopfliCostModel(m, model);
BROTLI_FREE(m, model);
BROTLI_FREE(m, matches);
return ComputeShortestPathFromNodes(num_bytes, nodes);
}
void BrotliCreateZopfliBackwardReferences(MemoryManager* m, size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals) {
ZopfliNode* nodes = BROTLI_ALLOC(m, ZopfliNode, num_bytes + 1);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(nodes)) return;
BrotliInitZopfliNodes(nodes, num_bytes + 1);
*num_commands += BrotliZopfliComputeShortestPath(m, num_bytes,
position, ringbuffer, ringbuffer_mask, literal_context_lut, params,
dist_cache, hasher, nodes);
if (BROTLI_IS_OOM(m)) return;
BrotliZopfliCreateCommands(num_bytes, position, nodes, dist_cache,
last_insert_len, params, commands, num_literals);
BROTLI_FREE(m, nodes);
}
void BrotliCreateHqZopfliBackwardReferences(MemoryManager* m, size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals) {
const size_t stream_offset = params->stream_offset;
const size_t max_backward_limit = BROTLI_MAX_BACKWARD_LIMIT(params->lgwin);
uint32_t* num_matches = BROTLI_ALLOC(m, uint32_t, num_bytes);
size_t matches_size = 4 * num_bytes;
const size_t store_end = num_bytes >= StoreLookaheadH10() ?
position + num_bytes - StoreLookaheadH10() + 1 : position;
size_t cur_match_pos = 0;
size_t i;
size_t orig_num_literals;
size_t orig_last_insert_len;
int orig_dist_cache[4];
size_t orig_num_commands;
ZopfliCostModel* model = BROTLI_ALLOC(m, ZopfliCostModel, 1);
ZopfliNode* nodes;
BackwardMatch* matches = BROTLI_ALLOC(m, BackwardMatch, matches_size);
const CompoundDictionary* addon = &params->dictionary.compound;
size_t gap = addon->total_size;
size_t shadow_matches =
(addon->num_chunks != 0) ? (MAX_NUM_MATCHES_H10 + 128) : 0;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(model) ||
BROTLI_IS_NULL(num_matches) || BROTLI_IS_NULL(matches)) {
return;
}
for (i = 0; i + HashTypeLengthH10() - 1 < num_bytes; ++i) {
const size_t pos = position + i;
size_t max_distance = BROTLI_MIN(size_t, pos, max_backward_limit);
size_t dictionary_start = BROTLI_MIN(size_t,
pos + stream_offset, max_backward_limit);
size_t max_length = num_bytes - i;
size_t num_found_matches;
size_t cur_match_end;
size_t j;
int dict_id = 0;
if (params->dictionary.contextual.context_based) {
uint8_t p1 = pos >= 1 ?
ringbuffer[(size_t)(pos - 1) & ringbuffer_mask] : 0;
uint8_t p2 = pos >= 2 ?
ringbuffer[(size_t)(pos - 2) & ringbuffer_mask] : 0;
dict_id = params->dictionary.contextual.context_map[
BROTLI_CONTEXT(p1, p2, literal_context_lut)];
}
/* Ensure that we have enough free slots. */
BROTLI_ENSURE_CAPACITY(m, BackwardMatch, matches, matches_size,
cur_match_pos + MAX_NUM_MATCHES_H10 + shadow_matches);
if (BROTLI_IS_OOM(m)) return;
num_found_matches = FindAllMatchesH10(&hasher->privat._H10,
params->dictionary.contextual.dict[dict_id],
ringbuffer, ringbuffer_mask, pos, max_length,
max_distance, dictionary_start + gap, params,
&matches[cur_match_pos + shadow_matches]);
if (addon->num_chunks != 0) {
size_t cd_matches = LookupAllCompoundDictionaryMatches(addon,
ringbuffer, ringbuffer_mask, pos, 3, max_length,
dictionary_start, params->dist.max_distance,
&matches[cur_match_pos + shadow_matches - 64], 64);
MergeMatches(&matches[cur_match_pos],
&matches[cur_match_pos + shadow_matches - 64], cd_matches,
&matches[cur_match_pos + shadow_matches], num_found_matches);
num_found_matches += cd_matches;
}
cur_match_end = cur_match_pos + num_found_matches;
for (j = cur_match_pos; j + 1 < cur_match_end; ++j) {
BROTLI_DCHECK(BackwardMatchLength(&matches[j]) <=
BackwardMatchLength(&matches[j + 1]));
}
num_matches[i] = (uint32_t)num_found_matches;
if (num_found_matches > 0) {
const size_t match_len = BackwardMatchLength(&matches[cur_match_end - 1]);
if (match_len > MAX_ZOPFLI_LEN_QUALITY_11) {
const size_t skip = match_len - 1;
matches[cur_match_pos++] = matches[cur_match_end - 1];
num_matches[i] = 1;
/* Add the tail of the copy to the hasher. */
StoreRangeH10(&hasher->privat._H10,
ringbuffer, ringbuffer_mask, pos + 1,
BROTLI_MIN(size_t, pos + match_len, store_end));
memset(&num_matches[i + 1], 0, skip * sizeof(num_matches[0]));
i += skip;
} else {
cur_match_pos = cur_match_end;
}
}
}
orig_num_literals = *num_literals;
orig_last_insert_len = *last_insert_len;
memcpy(orig_dist_cache, dist_cache, 4 * sizeof(dist_cache[0]));
orig_num_commands = *num_commands;
nodes = BROTLI_ALLOC(m, ZopfliNode, num_bytes + 1);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(nodes)) return;
InitZopfliCostModel(m, model, &params->dist, num_bytes);
if (BROTLI_IS_OOM(m)) return;
for (i = 0; i < 2; i++) {
BrotliInitZopfliNodes(nodes, num_bytes + 1);
if (i == 0) {
ZopfliCostModelSetFromLiteralCosts(
model, position, ringbuffer, ringbuffer_mask);
} else {
ZopfliCostModelSetFromCommands(model, position, ringbuffer,
ringbuffer_mask, commands, *num_commands - orig_num_commands,
orig_last_insert_len);
}
*num_commands = orig_num_commands;
*num_literals = orig_num_literals;
*last_insert_len = orig_last_insert_len;
memcpy(dist_cache, orig_dist_cache, 4 * sizeof(dist_cache[0]));
*num_commands += ZopfliIterate(num_bytes, position, ringbuffer,
ringbuffer_mask, params, gap, dist_cache, model, num_matches, matches,
nodes);
BrotliZopfliCreateCommands(num_bytes, position, nodes, dist_cache,
last_insert_len, params, commands, num_literals);
}
CleanupZopfliCostModel(m, model);
BROTLI_FREE(m, model);
BROTLI_FREE(m, nodes);
BROTLI_FREE(m, matches);
BROTLI_FREE(m, num_matches);
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

View File

@@ -0,0 +1,92 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function to find backward reference copies. */
#ifndef BROTLI_ENC_BACKWARD_REFERENCES_HQ_H_
#define BROTLI_ENC_BACKWARD_REFERENCES_HQ_H_
#include "../common/context.h"
#include "../common/platform.h"
#include "command.h"
#include "hash.h"
#include "memory.h"
#include "params.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
BROTLI_INTERNAL void BrotliCreateZopfliBackwardReferences(MemoryManager* m,
size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals);
BROTLI_INTERNAL void BrotliCreateHqZopfliBackwardReferences(MemoryManager* m,
size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals);
typedef struct ZopfliNode {
/* Best length to get up to this byte (not including this byte itself)
highest 7 bit is used to reconstruct the length code. */
uint32_t length;
/* Distance associated with the length. */
uint32_t distance;
/* Number of literal inserts before this copy; highest 5 bits contain
distance short code + 1 (or zero if no short code). */
uint32_t dcode_insert_length;
/* This union holds information used by dynamic-programming. During forward
pass |cost| it used to store the goal function. When node is processed its
|cost| is invalidated in favor of |shortcut|. On path back-tracing pass
|next| is assigned the offset to next node on the path. */
union {
/* Smallest cost to get to this byte from the beginning, as found so far. */
float cost;
/* Offset to the next node on the path. Equals to command_length() of the
next node on the path. For last node equals to BROTLI_UINT32_MAX */
uint32_t next;
/* Node position that provides next distance for distance cache. */
uint32_t shortcut;
} u;
} ZopfliNode;
BROTLI_INTERNAL void BrotliInitZopfliNodes(ZopfliNode* array, size_t length);
/* Computes the shortest path of commands from position to at most
position + num_bytes.
On return, path->size() is the number of commands found and path[i] is the
length of the i-th command (copy length plus insert length).
Note that the sum of the lengths of all commands can be less than num_bytes.
On return, the nodes[0..num_bytes] array will have the following
"ZopfliNode array invariant":
For each i in [1..num_bytes], if nodes[i].cost < kInfinity, then
(1) nodes[i].copy_length() >= 2
(2) nodes[i].command_length() <= i and
(3) nodes[i - nodes[i].command_length()].cost < kInfinity */
BROTLI_INTERNAL size_t BrotliZopfliComputeShortestPath(
MemoryManager* m, size_t num_bytes,
size_t position, const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
const int* dist_cache, Hasher* hasher, ZopfliNode* nodes);
BROTLI_INTERNAL void BrotliZopfliCreateCommands(
const size_t num_bytes, const size_t block_start, const ZopfliNode* nodes,
int* dist_cache, size_t* last_insert_len, const BrotliEncoderParams* params,
Command* commands, size_t* num_literals);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_BACKWARD_REFERENCES_HQ_H_ */

View File

@@ -0,0 +1,189 @@
/* NOLINT(build/header_guard) */
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: EXPORT_FN, FN */
static BROTLI_NOINLINE void EXPORT_FN(CreateBackwardReferences)(
size_t num_bytes, size_t position,
const uint8_t* ringbuffer, size_t ringbuffer_mask,
ContextLut literal_context_lut, const BrotliEncoderParams* params,
Hasher* hasher, int* dist_cache, size_t* last_insert_len,
Command* commands, size_t* num_commands, size_t* num_literals) {
HASHER()* privat = &hasher->privat.FN(_);
/* Set maximum distance, see section 9.1. of the spec. */
const size_t max_backward_limit = BROTLI_MAX_BACKWARD_LIMIT(params->lgwin);
const size_t position_offset = params->stream_offset;
const Command* const orig_commands = commands;
size_t insert_length = *last_insert_len;
const size_t pos_end = position + num_bytes;
const size_t store_end = num_bytes >= FN(StoreLookahead)() ?
position + num_bytes - FN(StoreLookahead)() + 1 : position;
/* For speed up heuristics for random data. */
const size_t random_heuristics_window_size =
LiteralSpreeLengthForSparseSearch(params);
size_t apply_random_heuristics = position + random_heuristics_window_size;
const size_t gap = params->dictionary.compound.total_size;
/* Minimum score to accept a backward reference. */
const score_t kMinScore = BROTLI_SCORE_BASE + 100;
FN(PrepareDistanceCache)(privat, dist_cache);
while (position + FN(HashTypeLength)() < pos_end) {
size_t max_length = pos_end - position;
size_t max_distance = BROTLI_MIN(size_t, position, max_backward_limit);
size_t dictionary_start = BROTLI_MIN(size_t,
position + position_offset, max_backward_limit);
HasherSearchResult sr;
int dict_id = 0;
uint8_t p1 = 0;
uint8_t p2 = 0;
if (params->dictionary.contextual.context_based) {
p1 = position >= 1 ?
ringbuffer[(size_t)(position - 1) & ringbuffer_mask] : 0;
p2 = position >= 2 ?
ringbuffer[(size_t)(position - 2) & ringbuffer_mask] : 0;
dict_id = params->dictionary.contextual.context_map[
BROTLI_CONTEXT(p1, p2, literal_context_lut)];
}
sr.len = 0;
sr.len_code_delta = 0;
sr.distance = 0;
sr.score = kMinScore;
FN(FindLongestMatch)(privat, params->dictionary.contextual.dict[dict_id],
ringbuffer, ringbuffer_mask, dist_cache, position, max_length,
max_distance, dictionary_start + gap, params->dist.max_distance, &sr);
if (ENABLE_COMPOUND_DICTIONARY) {
LookupCompoundDictionaryMatch(&params->dictionary.compound, ringbuffer,
ringbuffer_mask, dist_cache, position, max_length,
dictionary_start, params->dist.max_distance, &sr);
}
if (sr.score > kMinScore) {
/* Found a match. Let's look for something even better ahead. */
int delayed_backward_references_in_row = 0;
--max_length;
for (;; --max_length) {
const score_t cost_diff_lazy = 175;
HasherSearchResult sr2;
sr2.len = params->quality < MIN_QUALITY_FOR_EXTENSIVE_REFERENCE_SEARCH ?
BROTLI_MIN(size_t, sr.len - 1, max_length) : 0;
sr2.len_code_delta = 0;
sr2.distance = 0;
sr2.score = kMinScore;
max_distance = BROTLI_MIN(size_t, position + 1, max_backward_limit);
dictionary_start = BROTLI_MIN(size_t,
position + 1 + position_offset, max_backward_limit);
if (params->dictionary.contextual.context_based) {
p2 = p1;
p1 = ringbuffer[position & ringbuffer_mask];
dict_id = params->dictionary.contextual.context_map[
BROTLI_CONTEXT(p1, p2, literal_context_lut)];
}
FN(FindLongestMatch)(privat,
params->dictionary.contextual.dict[dict_id],
ringbuffer, ringbuffer_mask, dist_cache, position + 1, max_length,
max_distance, dictionary_start + gap, params->dist.max_distance,
&sr2);
if (ENABLE_COMPOUND_DICTIONARY) {
LookupCompoundDictionaryMatch(
&params->dictionary.compound, ringbuffer,
ringbuffer_mask, dist_cache, position + 1, max_length,
dictionary_start, params->dist.max_distance, &sr2);
}
if (sr2.score >= sr.score + cost_diff_lazy) {
/* Ok, let's just write one byte for now and start a match from the
next byte. */
++position;
++insert_length;
sr = sr2;
if (++delayed_backward_references_in_row < 4 &&
position + FN(HashTypeLength)() < pos_end) {
continue;
}
}
break;
}
apply_random_heuristics =
position + 2 * sr.len + random_heuristics_window_size;
dictionary_start = BROTLI_MIN(size_t,
position + position_offset, max_backward_limit);
{
/* The first 16 codes are special short-codes,
and the minimum offset is 1. */
size_t distance_code = ComputeDistanceCode(
sr.distance, dictionary_start + gap, dist_cache);
if ((sr.distance <= (dictionary_start + gap)) && distance_code > 0) {
dist_cache[3] = dist_cache[2];
dist_cache[2] = dist_cache[1];
dist_cache[1] = dist_cache[0];
dist_cache[0] = (int)sr.distance;
FN(PrepareDistanceCache)(privat, dist_cache);
}
InitCommand(commands++, &params->dist, insert_length,
sr.len, sr.len_code_delta, distance_code);
}
*num_literals += insert_length;
insert_length = 0;
/* Put the hash keys into the table, if there are enough bytes left.
Depending on the hasher implementation, it can push all positions
in the given range or only a subset of them.
Avoid hash poisoning with RLE data. */
{
size_t range_start = position + 2;
size_t range_end = BROTLI_MIN(size_t, position + sr.len, store_end);
if (sr.distance < (sr.len >> 2)) {
range_start = BROTLI_MIN(size_t, range_end, BROTLI_MAX(size_t,
range_start, position + sr.len - (sr.distance << 2)));
}
FN(StoreRange)(privat, ringbuffer, ringbuffer_mask, range_start,
range_end);
}
position += sr.len;
} else {
++insert_length;
++position;
/* If we have not seen matches for a long time, we can skip some
match lookups. Unsuccessful match lookups are very very expensive
and this kind of a heuristic speeds up compression quite
a lot. */
if (position > apply_random_heuristics) {
/* Going through uncompressible data, jump. */
if (position >
apply_random_heuristics + 4 * random_heuristics_window_size) {
/* It is quite a long time since we saw a copy, so we assume
that this data is not compressible, and store hashes less
often. Hashes of non compressible data are less likely to
turn out to be useful in the future, too, so we store less of
them to not to flood out the hash table of good compressible
data. */
const size_t kMargin =
BROTLI_MAX(size_t, FN(StoreLookahead)() - 1, 4);
size_t pos_jump =
BROTLI_MIN(size_t, position + 16, pos_end - kMargin);
for (; position < pos_jump; position += 4) {
FN(Store)(privat, ringbuffer, ringbuffer_mask, position);
insert_length += 4;
}
} else {
const size_t kMargin =
BROTLI_MAX(size_t, FN(StoreLookahead)() - 1, 2);
size_t pos_jump =
BROTLI_MIN(size_t, position + 8, pos_end - kMargin);
for (; position < pos_jump; position += 2) {
FN(Store)(privat, ringbuffer, ringbuffer_mask, position);
insert_length += 2;
}
}
}
}
}
insert_length += pos_end - position;
*last_insert_len = insert_length;
*num_commands += (size_t)(commands - orig_commands);
}

60
c/enc/bit_cost.c Normal file
View File

@@ -0,0 +1,60 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Functions to estimate the bit cost of Huffman trees. */
#include "bit_cost.h"
#include "../common/platform.h"
#include "fast_log.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
double BrotliBitsEntropy(const uint32_t* population, size_t size) {
size_t sum = 0;
double retval = 0;
const uint32_t* population_end = population + size;
size_t p;
if (size & 1) {
goto odd_number_of_elements_left;
}
while (population < population_end) {
p = *population++;
sum += p;
retval -= (double)p * FastLog2(p);
odd_number_of_elements_left:
p = *population++;
sum += p;
retval -= (double)p * FastLog2(p);
}
if (sum) retval += (double)sum * FastLog2(sum);
if (retval < (double)sum) {
/* TODO(eustas): consider doing that per-symbol? */
/* At least one bit per literal is needed. */
retval = (double)sum;
}
return retval;
}
#define FN(X) X ## Literal
#include "bit_cost_inc.h" /* NOLINT(build/include) */
#undef FN
#define FN(X) X ## Command
#include "bit_cost_inc.h" /* NOLINT(build/include) */
#undef FN
#define FN(X) X ## Distance
#include "bit_cost_inc.h" /* NOLINT(build/include) */
#undef FN
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

32
c/enc/bit_cost.h Normal file
View File

@@ -0,0 +1,32 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Functions to estimate the bit cost of Huffman trees. */
#ifndef BROTLI_ENC_BIT_COST_H_
#define BROTLI_ENC_BIT_COST_H_
#include "../common/platform.h"
#include "histogram.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
BROTLI_INTERNAL double BrotliBitsEntropy(
const uint32_t* population, size_t size);
BROTLI_INTERNAL double BrotliPopulationCostLiteral(
const HistogramLiteral* histogram);
BROTLI_INTERNAL double BrotliPopulationCostCommand(
const HistogramCommand* histogram);
BROTLI_INTERNAL double BrotliPopulationCostDistance(
const HistogramDistance* histogram);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_BIT_COST_H_ */

127
c/enc/bit_cost_inc.h Normal file
View File

@@ -0,0 +1,127 @@
/* NOLINT(build/header_guard) */
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN */
#define HistogramType FN(Histogram)
double FN(BrotliPopulationCost)(const HistogramType* histogram) {
static const double kOneSymbolHistogramCost = 12;
static const double kTwoSymbolHistogramCost = 20;
static const double kThreeSymbolHistogramCost = 28;
static const double kFourSymbolHistogramCost = 37;
const size_t data_size = FN(HistogramDataSize)();
int count = 0;
size_t s[5];
double bits = 0.0;
size_t i;
if (histogram->total_count_ == 0) {
return kOneSymbolHistogramCost;
}
for (i = 0; i < data_size; ++i) {
if (histogram->data_[i] > 0) {
s[count] = i;
++count;
if (count > 4) break;
}
}
if (count == 1) {
return kOneSymbolHistogramCost;
}
if (count == 2) {
return (kTwoSymbolHistogramCost + (double)histogram->total_count_);
}
if (count == 3) {
const uint32_t histo0 = histogram->data_[s[0]];
const uint32_t histo1 = histogram->data_[s[1]];
const uint32_t histo2 = histogram->data_[s[2]];
const uint32_t histomax =
BROTLI_MAX(uint32_t, histo0, BROTLI_MAX(uint32_t, histo1, histo2));
return (kThreeSymbolHistogramCost +
2 * (histo0 + histo1 + histo2) - histomax);
}
if (count == 4) {
uint32_t histo[4];
uint32_t h23;
uint32_t histomax;
for (i = 0; i < 4; ++i) {
histo[i] = histogram->data_[s[i]];
}
/* Sort */
for (i = 0; i < 4; ++i) {
size_t j;
for (j = i + 1; j < 4; ++j) {
if (histo[j] > histo[i]) {
BROTLI_SWAP(uint32_t, histo, j, i);
}
}
}
h23 = histo[2] + histo[3];
histomax = BROTLI_MAX(uint32_t, h23, histo[0]);
return (kFourSymbolHistogramCost +
3 * h23 + 2 * (histo[0] + histo[1]) - histomax);
}
{
/* In this loop we compute the entropy of the histogram and simultaneously
build a simplified histogram of the code length codes where we use the
zero repeat code 17, but we don't use the non-zero repeat code 16. */
size_t max_depth = 1;
uint32_t depth_histo[BROTLI_CODE_LENGTH_CODES] = { 0 };
const double log2total = FastLog2(histogram->total_count_);
for (i = 0; i < data_size;) {
if (histogram->data_[i] > 0) {
/* Compute -log2(P(symbol)) = -log2(count(symbol)/total_count) =
= log2(total_count) - log2(count(symbol)) */
double log2p = log2total - FastLog2(histogram->data_[i]);
/* Approximate the bit depth by round(-log2(P(symbol))) */
size_t depth = (size_t)(log2p + 0.5);
bits += histogram->data_[i] * log2p;
if (depth > 15) {
depth = 15;
}
if (depth > max_depth) {
max_depth = depth;
}
++depth_histo[depth];
++i;
} else {
/* Compute the run length of zeros and add the appropriate number of 0
and 17 code length codes to the code length code histogram. */
uint32_t reps = 1;
size_t k;
for (k = i + 1; k < data_size && histogram->data_[k] == 0; ++k) {
++reps;
}
i += reps;
if (i == data_size) {
/* Don't add any cost for the last zero run, since these are encoded
only implicitly. */
break;
}
if (reps < 3) {
depth_histo[0] += reps;
} else {
reps -= 2;
while (reps > 0) {
++depth_histo[BROTLI_REPEAT_ZERO_CODE_LENGTH];
/* Add the 3 extra bits for the 17 code length code. */
bits += 3;
reps >>= 3;
}
}
}
}
/* Add the estimated encoding cost of the code length code histogram. */
bits += (double)(18 + 2 * max_depth);
/* Add the entropy of the code length code histogram. */
bits += BrotliBitsEntropy(depth_histo, BROTLI_CODE_LENGTH_CODES);
}
return bits;
}
#undef HistogramType

34
c/enc/block_encoder_inc.h Normal file
View File

@@ -0,0 +1,34 @@
/* NOLINT(build/header_guard) */
/* Copyright 2014 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN */
#define HistogramType FN(Histogram)
/* Creates entropy codes for all block types and stores them to the bit
stream. */
static void FN(BuildAndStoreEntropyCodes)(MemoryManager* m, BlockEncoder* self,
const HistogramType* histograms, const size_t histograms_size,
const size_t alphabet_size, HuffmanTree* tree,
size_t* storage_ix, uint8_t* storage) {
const size_t table_size = histograms_size * self->histogram_length_;
self->depths_ = BROTLI_ALLOC(m, uint8_t, table_size);
self->bits_ = BROTLI_ALLOC(m, uint16_t, table_size);
if (BROTLI_IS_OOM(m)) return;
{
size_t i;
for (i = 0; i < histograms_size; ++i) {
size_t ix = i * self->histogram_length_;
BuildAndStoreHuffmanTree(&histograms[i].data_[0], self->histogram_length_,
alphabet_size, tree, &self->depths_[ix], &self->bits_[ix],
storage_ix, storage);
}
}
}
#undef HistogramType

214
c/enc/block_splitter.c Normal file
View File

@@ -0,0 +1,214 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Block split point selection utilities. */
#include "block_splitter.h"
#include "../common/platform.h"
#include "bit_cost.h"
#include "cluster.h"
#include "command.h"
#include "fast_log.h"
#include "histogram.h"
#include "memory.h"
#include "quality.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
static const size_t kMaxLiteralHistograms = 100;
static const size_t kMaxCommandHistograms = 50;
static const double kLiteralBlockSwitchCost = 28.1;
static const double kCommandBlockSwitchCost = 13.5;
static const double kDistanceBlockSwitchCost = 14.6;
static const size_t kLiteralStrideLength = 70;
static const size_t kCommandStrideLength = 40;
static const size_t kDistanceStrideLength = 40;
static const size_t kSymbolsPerLiteralHistogram = 544;
static const size_t kSymbolsPerCommandHistogram = 530;
static const size_t kSymbolsPerDistanceHistogram = 544;
static const size_t kMinLengthForBlockSplitting = 128;
static const size_t kIterMulForRefining = 2;
static const size_t kMinItersForRefining = 100;
static size_t CountLiterals(const Command* cmds, const size_t num_commands) {
/* Count how many we have. */
size_t total_length = 0;
size_t i;
for (i = 0; i < num_commands; ++i) {
total_length += cmds[i].insert_len_;
}
return total_length;
}
static void CopyLiteralsToByteArray(const Command* cmds,
const size_t num_commands,
const uint8_t* data,
const size_t offset,
const size_t mask,
uint8_t* literals) {
size_t pos = 0;
size_t from_pos = offset & mask;
size_t i;
for (i = 0; i < num_commands; ++i) {
size_t insert_len = cmds[i].insert_len_;
if (from_pos + insert_len > mask) {
size_t head_size = mask + 1 - from_pos;
memcpy(literals + pos, data + from_pos, head_size);
from_pos = 0;
pos += head_size;
insert_len -= head_size;
}
if (insert_len > 0) {
memcpy(literals + pos, data + from_pos, insert_len);
pos += insert_len;
}
from_pos = (from_pos + insert_len + CommandCopyLen(&cmds[i])) & mask;
}
}
static BROTLI_INLINE uint32_t MyRand(uint32_t* seed) {
/* Initial seed should be 7. In this case, loop length is (1 << 29). */
*seed *= 16807U;
return *seed;
}
static BROTLI_INLINE double BitCost(size_t count) {
return count == 0 ? -2.0 : FastLog2(count);
}
#define HISTOGRAMS_PER_BATCH 64
#define CLUSTERS_PER_BATCH 16
#define FN(X) X ## Literal
#define DataType uint8_t
/* NOLINTNEXTLINE(build/include) */
#include "block_splitter_inc.h"
#undef DataType
#undef FN
#define FN(X) X ## Command
#define DataType uint16_t
/* NOLINTNEXTLINE(build/include) */
#include "block_splitter_inc.h"
#undef FN
#define FN(X) X ## Distance
/* NOLINTNEXTLINE(build/include) */
#include "block_splitter_inc.h"
#undef DataType
#undef FN
void BrotliInitBlockSplit(BlockSplit* self) {
self->num_types = 0;
self->num_blocks = 0;
self->types = 0;
self->lengths = 0;
self->types_alloc_size = 0;
self->lengths_alloc_size = 0;
}
void BrotliDestroyBlockSplit(MemoryManager* m, BlockSplit* self) {
BROTLI_FREE(m, self->types);
BROTLI_FREE(m, self->lengths);
}
/* Extracts literals, command distance and prefix codes, then applies
* SplitByteVector to create partitioning. */
void BrotliSplitBlock(MemoryManager* m,
const Command* cmds,
const size_t num_commands,
const uint8_t* data,
const size_t pos,
const size_t mask,
const BrotliEncoderParams* params,
BlockSplit* literal_split,
BlockSplit* insert_and_copy_split,
BlockSplit* dist_split) {
{
size_t literals_count = CountLiterals(cmds, num_commands);
uint8_t* literals = BROTLI_ALLOC(m, uint8_t, literals_count);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(literals)) return;
/* Create a continuous array of literals. */
CopyLiteralsToByteArray(cmds, num_commands, data, pos, mask, literals);
/* Create the block split on the array of literals.
* Literal histograms can have alphabet size up to 256.
* Though, to accommodate context modeling, less than half of maximum size
* is allowed. */
SplitByteVectorLiteral(
m, literals, literals_count,
kSymbolsPerLiteralHistogram, kMaxLiteralHistograms,
kLiteralStrideLength, kLiteralBlockSwitchCost, params,
literal_split);
if (BROTLI_IS_OOM(m)) return;
BROTLI_FREE(m, literals);
/* NB: this might be a good place for injecting extra splitting without
* increasing encoder complexity; however, output partition would be less
* optimal than one produced with forced splitting inside
* SplitByteVector (FindBlocks / ClusterBlocks). */
}
{
/* Compute prefix codes for commands. */
uint16_t* insert_and_copy_codes = BROTLI_ALLOC(m, uint16_t, num_commands);
size_t i;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(insert_and_copy_codes)) return;
for (i = 0; i < num_commands; ++i) {
insert_and_copy_codes[i] = cmds[i].cmd_prefix_;
}
/* Create the block split on the array of command prefixes. */
SplitByteVectorCommand(
m, insert_and_copy_codes, num_commands,
kSymbolsPerCommandHistogram, kMaxCommandHistograms,
kCommandStrideLength, kCommandBlockSwitchCost, params,
insert_and_copy_split);
if (BROTLI_IS_OOM(m)) return;
/* TODO(eustas): reuse for distances? */
BROTLI_FREE(m, insert_and_copy_codes);
}
{
/* Create a continuous array of distance prefixes. */
uint16_t* distance_prefixes = BROTLI_ALLOC(m, uint16_t, num_commands);
size_t j = 0;
size_t i;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(distance_prefixes)) return;
for (i = 0; i < num_commands; ++i) {
const Command* cmd = &cmds[i];
if (CommandCopyLen(cmd) && cmd->cmd_prefix_ >= 128) {
distance_prefixes[j++] = cmd->dist_prefix_ & 0x3FF;
}
}
/* Create the block split on the array of distance prefixes. */
SplitByteVectorDistance(
m, distance_prefixes, j,
kSymbolsPerDistanceHistogram, kMaxCommandHistograms,
kDistanceStrideLength, kDistanceBlockSwitchCost, params,
dist_split);
if (BROTLI_IS_OOM(m)) return;
BROTLI_FREE(m, distance_prefixes);
}
}
#if defined(BROTLI_TEST)
size_t BrotliCountLiteralsForTest(const Command*, size_t);
size_t BrotliCountLiteralsForTest(const Command* cmds, size_t num_commands) {
return CountLiterals(cmds, num_commands);
}
void BrotliCopyLiteralsToByteArrayForTest(
const Command*, size_t, const uint8_t*, size_t, size_t, uint8_t*);
void BrotliCopyLiteralsToByteArrayForTest(const Command* cmds,
size_t num_commands, const uint8_t* data, size_t offset, size_t mask,
uint8_t* literals) {
CopyLiteralsToByteArray(cmds, num_commands, data, offset, mask, literals);
}
#endif
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

49
c/enc/block_splitter.h Normal file
View File

@@ -0,0 +1,49 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Block split point selection utilities. */
#ifndef BROTLI_ENC_BLOCK_SPLITTER_H_
#define BROTLI_ENC_BLOCK_SPLITTER_H_
#include "../common/platform.h"
#include "command.h"
#include "memory.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
typedef struct BlockSplit {
size_t num_types; /* Amount of distinct types */
size_t num_blocks; /* Amount of values in types and length */
uint8_t* types;
uint32_t* lengths;
size_t types_alloc_size;
size_t lengths_alloc_size;
} BlockSplit;
BROTLI_INTERNAL void BrotliInitBlockSplit(BlockSplit* self);
BROTLI_INTERNAL void BrotliDestroyBlockSplit(MemoryManager* m,
BlockSplit* self);
BROTLI_INTERNAL void BrotliSplitBlock(MemoryManager* m,
const Command* cmds,
const size_t num_commands,
const uint8_t* data,
const size_t offset,
const size_t mask,
const BrotliEncoderParams* params,
BlockSplit* literal_split,
BlockSplit* insert_and_copy_split,
BlockSplit* dist_split);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_BLOCK_SPLITTER_H_ */

487
c/enc/block_splitter_inc.h Normal file
View File

@@ -0,0 +1,487 @@
/* NOLINT(build/header_guard) */
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN, DataType */
#define HistogramType FN(Histogram)
static void FN(InitialEntropyCodes)(const DataType* data, size_t length,
size_t stride,
size_t num_histograms,
HistogramType* histograms) {
uint32_t seed = 7;
size_t block_length = length / num_histograms;
size_t i;
FN(ClearHistograms)(histograms, num_histograms);
for (i = 0; i < num_histograms; ++i) {
size_t pos = length * i / num_histograms;
if (i != 0) {
pos += MyRand(&seed) % block_length;
}
if (pos + stride >= length) {
pos = length - stride - 1;
}
FN(HistogramAddVector)(&histograms[i], data + pos, stride);
}
}
static void FN(RandomSample)(uint32_t* seed,
const DataType* data,
size_t length,
size_t stride,
HistogramType* sample) {
size_t pos = 0;
if (stride >= length) {
stride = length;
} else {
pos = MyRand(seed) % (length - stride + 1);
}
FN(HistogramAddVector)(sample, data + pos, stride);
}
static void FN(RefineEntropyCodes)(const DataType* data, size_t length,
size_t stride,
size_t num_histograms,
HistogramType* histograms,
HistogramType* tmp) {
size_t iters =
kIterMulForRefining * length / stride + kMinItersForRefining;
uint32_t seed = 7;
size_t iter;
iters = ((iters + num_histograms - 1) / num_histograms) * num_histograms;
for (iter = 0; iter < iters; ++iter) {
FN(HistogramClear)(tmp);
FN(RandomSample)(&seed, data, length, stride, tmp);
FN(HistogramAddHistogram)(&histograms[iter % num_histograms], tmp);
}
}
/* Assigns a block id from the range [0, num_histograms) to each data element
in data[0..length) and fills in block_id[0..length) with the assigned values.
Returns the number of blocks, i.e. one plus the number of block switches. */
static size_t FN(FindBlocks)(const DataType* data, const size_t length,
const double block_switch_bitcost,
const size_t num_histograms,
const HistogramType* histograms,
double* insert_cost,
double* cost,
uint8_t* switch_signal,
uint8_t* block_id) {
const size_t alphabet_size = FN(HistogramDataSize)();
const size_t bitmap_len = (num_histograms + 7) >> 3;
size_t num_blocks = 1;
size_t byte_ix;
size_t i;
size_t j;
BROTLI_DCHECK(num_histograms <= 256);
/* Trivial case: single histogram -> single block type. */
if (num_histograms <= 1) {
for (i = 0; i < length; ++i) {
block_id[i] = 0;
}
return 1;
}
/* Fill bitcost for each symbol of all histograms.
* Non-existing symbol cost: 2 + log2(total_count).
* Regular symbol cost: -log2(symbol_count / total_count). */
memset(insert_cost, 0,
sizeof(insert_cost[0]) * alphabet_size * num_histograms);
for (i = 0; i < num_histograms; ++i) {
insert_cost[i] = FastLog2((uint32_t)histograms[i].total_count_);
}
for (i = alphabet_size; i != 0;) {
/* Reverse order to use the 0-th row as a temporary storage. */
--i;
for (j = 0; j < num_histograms; ++j) {
insert_cost[i * num_histograms + j] =
insert_cost[j] - BitCost(histograms[j].data_[i]);
}
}
/* After each iteration of this loop, cost[k] will contain the difference
between the minimum cost of arriving at the current byte position using
entropy code k, and the minimum cost of arriving at the current byte
position. This difference is capped at the block switch cost, and if it
reaches block switch cost, it means that when we trace back from the last
position, we need to switch here. */
memset(cost, 0, sizeof(cost[0]) * num_histograms);
memset(switch_signal, 0, sizeof(switch_signal[0]) * length * bitmap_len);
for (byte_ix = 0; byte_ix < length; ++byte_ix) {
size_t ix = byte_ix * bitmap_len;
size_t symbol = data[byte_ix];
size_t insert_cost_ix = symbol * num_histograms;
double min_cost = 1e99;
double block_switch_cost = block_switch_bitcost;
static const size_t prologue_length = 2000;
static const double multiplier = 0.07 / 2000;
size_t k;
for (k = 0; k < num_histograms; ++k) {
/* We are coding the symbol with entropy code k. */
cost[k] += insert_cost[insert_cost_ix + k];
if (cost[k] < min_cost) {
min_cost = cost[k];
block_id[byte_ix] = (uint8_t)k;
}
}
/* More blocks for the beginning. */
if (byte_ix < prologue_length) {
block_switch_cost *= 0.77 + multiplier * (double)byte_ix;
}
for (k = 0; k < num_histograms; ++k) {
cost[k] -= min_cost;
if (cost[k] >= block_switch_cost) {
const uint8_t mask = (uint8_t)(1u << (k & 7));
cost[k] = block_switch_cost;
BROTLI_DCHECK((k >> 3) < bitmap_len);
switch_signal[ix + (k >> 3)] |= mask;
}
}
}
byte_ix = length - 1;
{ /* Trace back from the last position and switch at the marked places. */
size_t ix = byte_ix * bitmap_len;
uint8_t cur_id = block_id[byte_ix];
while (byte_ix > 0) {
const uint8_t mask = (uint8_t)(1u << (cur_id & 7));
BROTLI_DCHECK(((size_t)cur_id >> 3) < bitmap_len);
--byte_ix;
ix -= bitmap_len;
if (switch_signal[ix + (cur_id >> 3)] & mask) {
if (cur_id != block_id[byte_ix]) {
cur_id = block_id[byte_ix];
++num_blocks;
}
}
block_id[byte_ix] = cur_id;
}
}
return num_blocks;
}
static size_t FN(RemapBlockIds)(uint8_t* block_ids, const size_t length,
uint16_t* new_id, const size_t num_histograms) {
static const uint16_t kInvalidId = 256;
uint16_t next_id = 0;
size_t i;
for (i = 0; i < num_histograms; ++i) {
new_id[i] = kInvalidId;
}
for (i = 0; i < length; ++i) {
BROTLI_DCHECK(block_ids[i] < num_histograms);
if (new_id[block_ids[i]] == kInvalidId) {
new_id[block_ids[i]] = next_id++;
}
}
for (i = 0; i < length; ++i) {
block_ids[i] = (uint8_t)new_id[block_ids[i]];
BROTLI_DCHECK(block_ids[i] < num_histograms);
}
BROTLI_DCHECK(next_id <= num_histograms);
return next_id;
}
static void FN(BuildBlockHistograms)(const DataType* data, const size_t length,
const uint8_t* block_ids,
const size_t num_histograms,
HistogramType* histograms) {
size_t i;
FN(ClearHistograms)(histograms, num_histograms);
for (i = 0; i < length; ++i) {
FN(HistogramAdd)(&histograms[block_ids[i]], data[i]);
}
}
/* Given the initial partitioning build partitioning with limited number
* of histograms (and block types). */
static void FN(ClusterBlocks)(MemoryManager* m,
const DataType* data, const size_t length,
const size_t num_blocks,
uint8_t* block_ids,
BlockSplit* split) {
uint32_t* histogram_symbols = BROTLI_ALLOC(m, uint32_t, num_blocks);
uint32_t* u32 =
BROTLI_ALLOC(m, uint32_t, num_blocks + 4 * HISTOGRAMS_PER_BATCH);
const size_t expected_num_clusters = CLUSTERS_PER_BATCH *
(num_blocks + HISTOGRAMS_PER_BATCH - 1) / HISTOGRAMS_PER_BATCH;
size_t all_histograms_size = 0;
size_t all_histograms_capacity = expected_num_clusters;
HistogramType* all_histograms =
BROTLI_ALLOC(m, HistogramType, all_histograms_capacity);
size_t cluster_size_size = 0;
size_t cluster_size_capacity = expected_num_clusters;
uint32_t* cluster_size = BROTLI_ALLOC(m, uint32_t, cluster_size_capacity);
size_t num_clusters = 0;
HistogramType* histograms = BROTLI_ALLOC(m, HistogramType,
BROTLI_MIN(size_t, num_blocks, HISTOGRAMS_PER_BATCH));
size_t max_num_pairs =
HISTOGRAMS_PER_BATCH * HISTOGRAMS_PER_BATCH / 2;
size_t pairs_capacity = max_num_pairs + 1;
HistogramPair* pairs = BROTLI_ALLOC(m, HistogramPair, pairs_capacity);
size_t pos = 0;
uint32_t* clusters;
size_t num_final_clusters;
static const uint32_t kInvalidIndex = BROTLI_UINT32_MAX;
uint32_t* new_index;
size_t i;
uint32_t* BROTLI_RESTRICT const sizes =
u32 ? (u32 + 0 * HISTOGRAMS_PER_BATCH) : NULL;
uint32_t* BROTLI_RESTRICT const new_clusters =
u32 ? (u32 + 1 * HISTOGRAMS_PER_BATCH) : NULL;
uint32_t* BROTLI_RESTRICT const symbols =
u32 ? (u32 + 2 * HISTOGRAMS_PER_BATCH) : NULL;
uint32_t* BROTLI_RESTRICT const remap =
u32 ? (u32 + 3 * HISTOGRAMS_PER_BATCH) : NULL;
uint32_t* BROTLI_RESTRICT const block_lengths =
u32 ? (u32 + 4 * HISTOGRAMS_PER_BATCH) : NULL;
/* TODO(eustas): move to arena? */
HistogramType* tmp = BROTLI_ALLOC(m, HistogramType, 2);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(histogram_symbols) ||
BROTLI_IS_NULL(u32) || BROTLI_IS_NULL(all_histograms) ||
BROTLI_IS_NULL(cluster_size) || BROTLI_IS_NULL(histograms) ||
BROTLI_IS_NULL(pairs) || BROTLI_IS_NULL(tmp)) {
return;
}
memset(u32, 0, (num_blocks + 4 * HISTOGRAMS_PER_BATCH) * sizeof(uint32_t));
/* Calculate block lengths (convert repeating values -> series length). */
{
size_t block_idx = 0;
for (i = 0; i < length; ++i) {
BROTLI_DCHECK(block_idx < num_blocks);
++block_lengths[block_idx];
if (i + 1 == length || block_ids[i] != block_ids[i + 1]) {
++block_idx;
}
}
BROTLI_DCHECK(block_idx == num_blocks);
}
/* Pre-cluster blocks (cluster batches). */
for (i = 0; i < num_blocks; i += HISTOGRAMS_PER_BATCH) {
const size_t num_to_combine =
BROTLI_MIN(size_t, num_blocks - i, HISTOGRAMS_PER_BATCH);
size_t num_new_clusters;
size_t j;
for (j = 0; j < num_to_combine; ++j) {
size_t k;
size_t block_length = block_lengths[i + j];
FN(HistogramClear)(&histograms[j]);
for (k = 0; k < block_length; ++k) {
FN(HistogramAdd)(&histograms[j], data[pos++]);
}
histograms[j].bit_cost_ = FN(BrotliPopulationCost)(&histograms[j]);
new_clusters[j] = (uint32_t)j;
symbols[j] = (uint32_t)j;
sizes[j] = 1;
}
num_new_clusters = FN(BrotliHistogramCombine)(
histograms, tmp, sizes, symbols, new_clusters, pairs, num_to_combine,
num_to_combine, HISTOGRAMS_PER_BATCH, max_num_pairs);
BROTLI_ENSURE_CAPACITY(m, HistogramType, all_histograms,
all_histograms_capacity, all_histograms_size + num_new_clusters);
BROTLI_ENSURE_CAPACITY(m, uint32_t, cluster_size,
cluster_size_capacity, cluster_size_size + num_new_clusters);
if (BROTLI_IS_OOM(m)) return;
for (j = 0; j < num_new_clusters; ++j) {
all_histograms[all_histograms_size++] = histograms[new_clusters[j]];
cluster_size[cluster_size_size++] = sizes[new_clusters[j]];
remap[new_clusters[j]] = (uint32_t)j;
}
for (j = 0; j < num_to_combine; ++j) {
histogram_symbols[i + j] = (uint32_t)num_clusters + remap[symbols[j]];
}
num_clusters += num_new_clusters;
BROTLI_DCHECK(num_clusters == cluster_size_size);
BROTLI_DCHECK(num_clusters == all_histograms_size);
}
BROTLI_FREE(m, histograms);
/* Final clustering. */
max_num_pairs =
BROTLI_MIN(size_t, 64 * num_clusters, (num_clusters / 2) * num_clusters);
if (pairs_capacity < max_num_pairs + 1) {
BROTLI_FREE(m, pairs);
pairs = BROTLI_ALLOC(m, HistogramPair, max_num_pairs + 1);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(pairs)) return;
}
clusters = BROTLI_ALLOC(m, uint32_t, num_clusters);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(clusters)) return;
for (i = 0; i < num_clusters; ++i) {
clusters[i] = (uint32_t)i;
}
num_final_clusters = FN(BrotliHistogramCombine)(
all_histograms, tmp, cluster_size, histogram_symbols, clusters, pairs,
num_clusters, num_blocks, BROTLI_MAX_NUMBER_OF_BLOCK_TYPES,
max_num_pairs);
BROTLI_FREE(m, pairs);
BROTLI_FREE(m, cluster_size);
/* Assign blocks to final histograms. */
new_index = BROTLI_ALLOC(m, uint32_t, num_clusters);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(new_index)) return;
for (i = 0; i < num_clusters; ++i) new_index[i] = kInvalidIndex;
pos = 0;
{
uint32_t next_index = 0;
for (i = 0; i < num_blocks; ++i) {
size_t j;
uint32_t best_out;
double best_bits;
FN(HistogramClear)(tmp);
for (j = 0; j < block_lengths[i]; ++j) {
FN(HistogramAdd)(tmp, data[pos++]);
}
/* Among equally good histograms prefer last used. */
/* TODO(eustas): should we give a block-switch discount here? */
best_out = (i == 0) ? histogram_symbols[0] : histogram_symbols[i - 1];
best_bits = FN(BrotliHistogramBitCostDistance)(
tmp, &all_histograms[best_out], tmp + 1);
for (j = 0; j < num_final_clusters; ++j) {
const double cur_bits = FN(BrotliHistogramBitCostDistance)(
tmp, &all_histograms[clusters[j]], tmp + 1);
if (cur_bits < best_bits) {
best_bits = cur_bits;
best_out = clusters[j];
}
}
histogram_symbols[i] = best_out;
if (new_index[best_out] == kInvalidIndex) {
new_index[best_out] = next_index++;
}
}
}
BROTLI_FREE(m, tmp);
BROTLI_FREE(m, clusters);
BROTLI_FREE(m, all_histograms);
BROTLI_ENSURE_CAPACITY(
m, uint8_t, split->types, split->types_alloc_size, num_blocks);
BROTLI_ENSURE_CAPACITY(
m, uint32_t, split->lengths, split->lengths_alloc_size, num_blocks);
if (BROTLI_IS_OOM(m)) return;
/* Rewrite final assignment to block-split. There might be less blocks
* than |num_blocks| due to clustering. */
{
uint32_t cur_length = 0;
size_t block_idx = 0;
uint8_t max_type = 0;
for (i = 0; i < num_blocks; ++i) {
cur_length += block_lengths[i];
if (i + 1 == num_blocks ||
histogram_symbols[i] != histogram_symbols[i + 1]) {
const uint8_t id = (uint8_t)new_index[histogram_symbols[i]];
split->types[block_idx] = id;
split->lengths[block_idx] = cur_length;
max_type = BROTLI_MAX(uint8_t, max_type, id);
cur_length = 0;
++block_idx;
}
}
split->num_blocks = block_idx;
split->num_types = (size_t)max_type + 1;
}
BROTLI_FREE(m, new_index);
BROTLI_FREE(m, u32);
BROTLI_FREE(m, histogram_symbols);
}
/* Create BlockSplit (partitioning) given the limits, estimates and "effort"
* parameters.
*
* NB: max_histograms is often less than number of histograms allowed by format;
* this is done intentionally, to save some "space" for context-aware
* clustering (here entropy is estimated for context-free symbols). */
static void FN(SplitByteVector)(MemoryManager* m,
const DataType* data, const size_t length,
const size_t symbols_per_histogram,
const size_t max_histograms,
const size_t sampling_stride_length,
const double block_switch_cost,
const BrotliEncoderParams* params,
BlockSplit* split) {
const size_t data_size = FN(HistogramDataSize)();
HistogramType* histograms;
HistogramType* tmp;
/* Calculate number of histograms; initial estimate is one histogram per
* specified amount of symbols; however, this value is capped. */
size_t num_histograms = length / symbols_per_histogram + 1;
if (num_histograms > max_histograms) {
num_histograms = max_histograms;
}
/* Corner case: no input. */
if (length == 0) {
split->num_types = 1;
return;
}
if (length < kMinLengthForBlockSplitting) {
BROTLI_ENSURE_CAPACITY(m, uint8_t,
split->types, split->types_alloc_size, split->num_blocks + 1);
BROTLI_ENSURE_CAPACITY(m, uint32_t,
split->lengths, split->lengths_alloc_size, split->num_blocks + 1);
if (BROTLI_IS_OOM(m)) return;
split->num_types = 1;
split->types[split->num_blocks] = 0;
split->lengths[split->num_blocks] = (uint32_t)length;
split->num_blocks++;
return;
}
histograms = BROTLI_ALLOC(m, HistogramType, num_histograms + 1);
tmp = histograms + num_histograms;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(histograms)) return;
/* Find good entropy codes. */
FN(InitialEntropyCodes)(data, length,
sampling_stride_length,
num_histograms, histograms);
FN(RefineEntropyCodes)(data, length,
sampling_stride_length,
num_histograms, histograms, tmp);
{
/* Find a good path through literals with the good entropy codes. */
uint8_t* block_ids = BROTLI_ALLOC(m, uint8_t, length);
size_t num_blocks = 0;
const size_t bitmaplen = (num_histograms + 7) >> 3;
double* insert_cost = BROTLI_ALLOC(m, double, data_size * num_histograms);
double* cost = BROTLI_ALLOC(m, double, num_histograms);
uint8_t* switch_signal = BROTLI_ALLOC(m, uint8_t, length * bitmaplen);
uint16_t* new_id = BROTLI_ALLOC(m, uint16_t, num_histograms);
const size_t iters = params->quality < HQ_ZOPFLIFICATION_QUALITY ? 3 : 10;
size_t i;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(block_ids) ||
BROTLI_IS_NULL(insert_cost) || BROTLI_IS_NULL(cost) ||
BROTLI_IS_NULL(switch_signal) || BROTLI_IS_NULL(new_id)) {
return;
}
for (i = 0; i < iters; ++i) {
num_blocks = FN(FindBlocks)(data, length,
block_switch_cost,
num_histograms, histograms,
insert_cost, cost, switch_signal,
block_ids);
num_histograms = FN(RemapBlockIds)(block_ids, length,
new_id, num_histograms);
FN(BuildBlockHistograms)(data, length, block_ids,
num_histograms, histograms);
}
BROTLI_FREE(m, insert_cost);
BROTLI_FREE(m, cost);
BROTLI_FREE(m, switch_signal);
BROTLI_FREE(m, new_id);
BROTLI_FREE(m, histograms);
FN(ClusterBlocks)(m, data, length, num_blocks, block_ids, split);
if (BROTLI_IS_OOM(m)) return;
BROTLI_FREE(m, block_ids);
}
}
#undef HistogramType

1337
c/enc/brotli_bit_stream.c Normal file

File diff suppressed because it is too large Load Diff

83
c/enc/brotli_bit_stream.h Normal file
View File

@@ -0,0 +1,83 @@
/* Copyright 2014 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Functions to convert brotli-related data structures into the
brotli bit stream. The functions here operate under
assumption that there is enough space in the storage, i.e., there are
no out-of-range checks anywhere.
These functions do bit addressing into a byte array. The byte array
is called "storage" and the index to the bit is called storage_ix
in function arguments. */
#ifndef BROTLI_ENC_BROTLI_BIT_STREAM_H_
#define BROTLI_ENC_BROTLI_BIT_STREAM_H_
#include "../common/context.h"
#include "../common/platform.h"
#include "command.h"
#include "entropy_encode.h"
#include "memory.h"
#include "metablock.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* All Store functions here will use a storage_ix, which is always the bit
position for the current storage. */
BROTLI_INTERNAL void BrotliStoreHuffmanTree(const uint8_t* depths, size_t num,
HuffmanTree* tree, size_t* storage_ix, uint8_t* storage);
BROTLI_INTERNAL void BrotliBuildAndStoreHuffmanTreeFast(
HuffmanTree* tree, const uint32_t* histogram, const size_t histogram_total,
const size_t max_bits, uint8_t* depth, uint16_t* bits, size_t* storage_ix,
uint8_t* storage);
/* REQUIRES: length > 0 */
/* REQUIRES: length <= (1 << 24) */
BROTLI_INTERNAL void BrotliStoreMetaBlock(MemoryManager* m,
const uint8_t* input, size_t start_pos, size_t length, size_t mask,
uint8_t prev_byte, uint8_t prev_byte2, BROTLI_BOOL is_last,
const BrotliEncoderParams* params, ContextType literal_context_mode,
const Command* commands, size_t n_commands, const MetaBlockSplit* mb,
size_t* storage_ix, uint8_t* storage);
/* Stores the meta-block without doing any block splitting, just collects
one histogram per block category and uses that for entropy coding.
REQUIRES: length > 0
REQUIRES: length <= (1 << 24) */
BROTLI_INTERNAL void BrotliStoreMetaBlockTrivial(MemoryManager* m,
const uint8_t* input, size_t start_pos, size_t length, size_t mask,
BROTLI_BOOL is_last, const BrotliEncoderParams* params,
const Command* commands, size_t n_commands,
size_t* storage_ix, uint8_t* storage);
/* Same as above, but uses static prefix codes for histograms with a only a few
symbols, and uses static code length prefix codes for all other histograms.
REQUIRES: length > 0
REQUIRES: length <= (1 << 24) */
BROTLI_INTERNAL void BrotliStoreMetaBlockFast(MemoryManager* m,
const uint8_t* input, size_t start_pos, size_t length, size_t mask,
BROTLI_BOOL is_last, const BrotliEncoderParams* params,
const Command* commands, size_t n_commands,
size_t* storage_ix, uint8_t* storage);
/* This is for storing uncompressed blocks (simple raw storage of
bytes-as-bytes).
REQUIRES: length > 0
REQUIRES: length <= (1 << 24) */
BROTLI_INTERNAL void BrotliStoreUncompressedMetaBlock(
BROTLI_BOOL is_final_block, const uint8_t* BROTLI_RESTRICT input,
size_t position, size_t mask, size_t len,
size_t* BROTLI_RESTRICT storage_ix, uint8_t* BROTLI_RESTRICT storage);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_BROTLI_BIT_STREAM_H_ */

55
c/enc/cluster.c Normal file
View File

@@ -0,0 +1,55 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Functions for clustering similar histograms together. */
#include "cluster.h"
#include "../common/platform.h"
#include "bit_cost.h" /* BrotliPopulationCost */
#include "fast_log.h"
#include "histogram.h"
#include "memory.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
static BROTLI_INLINE BROTLI_BOOL HistogramPairIsLess(
const HistogramPair* p1, const HistogramPair* p2) {
if (p1->cost_diff != p2->cost_diff) {
return TO_BROTLI_BOOL(p1->cost_diff > p2->cost_diff);
}
return TO_BROTLI_BOOL((p1->idx2 - p1->idx1) > (p2->idx2 - p2->idx1));
}
/* Returns entropy reduction of the context map when we combine two clusters. */
static BROTLI_INLINE double ClusterCostDiff(size_t size_a, size_t size_b) {
size_t size_c = size_a + size_b;
return (double)size_a * FastLog2(size_a) +
(double)size_b * FastLog2(size_b) -
(double)size_c * FastLog2(size_c);
}
#define CODE(X) X
#define FN(X) X ## Literal
#include "cluster_inc.h" /* NOLINT(build/include) */
#undef FN
#define FN(X) X ## Command
#include "cluster_inc.h" /* NOLINT(build/include) */
#undef FN
#define FN(X) X ## Distance
#include "cluster_inc.h" /* NOLINT(build/include) */
#undef FN
#undef CODE
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

47
c/enc/cluster.h Normal file
View File

@@ -0,0 +1,47 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Functions for clustering similar histograms together. */
#ifndef BROTLI_ENC_CLUSTER_H_
#define BROTLI_ENC_CLUSTER_H_
#include "../common/platform.h"
#include "histogram.h"
#include "memory.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
typedef struct HistogramPair {
uint32_t idx1;
uint32_t idx2;
double cost_combo;
double cost_diff;
} HistogramPair;
#define CODE(X) /* Declaration */;
#define FN(X) X ## Literal
#include "cluster_inc.h" /* NOLINT(build/include) */
#undef FN
#define FN(X) X ## Command
#include "cluster_inc.h" /* NOLINT(build/include) */
#undef FN
#define FN(X) X ## Distance
#include "cluster_inc.h" /* NOLINT(build/include) */
#undef FN
#undef CODE
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_CLUSTER_H_ */

325
c/enc/cluster_inc.h Normal file
View File

@@ -0,0 +1,325 @@
/* NOLINT(build/header_guard) */
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN, CODE */
#define HistogramType FN(Histogram)
/* Computes the bit cost reduction by combining out[idx1] and out[idx2] and if
it is below a threshold, stores the pair (idx1, idx2) in the *pairs queue. */
BROTLI_INTERNAL void FN(BrotliCompareAndPushToQueue)(
const HistogramType* out, HistogramType* tmp, const uint32_t* cluster_size,
uint32_t idx1, uint32_t idx2, size_t max_num_pairs, HistogramPair* pairs,
size_t* num_pairs) CODE({
BROTLI_BOOL is_good_pair = BROTLI_FALSE;
HistogramPair p;
p.idx1 = p.idx2 = 0;
p.cost_diff = p.cost_combo = 0;
if (idx1 == idx2) {
return;
}
if (idx2 < idx1) {
uint32_t t = idx2;
idx2 = idx1;
idx1 = t;
}
p.idx1 = idx1;
p.idx2 = idx2;
p.cost_diff = 0.5 * ClusterCostDiff(cluster_size[idx1], cluster_size[idx2]);
p.cost_diff -= out[idx1].bit_cost_;
p.cost_diff -= out[idx2].bit_cost_;
if (out[idx1].total_count_ == 0) {
p.cost_combo = out[idx2].bit_cost_;
is_good_pair = BROTLI_TRUE;
} else if (out[idx2].total_count_ == 0) {
p.cost_combo = out[idx1].bit_cost_;
is_good_pair = BROTLI_TRUE;
} else {
double threshold = *num_pairs == 0 ? 1e99 :
BROTLI_MAX(double, 0.0, pairs[0].cost_diff);
double cost_combo;
*tmp = out[idx1];
FN(HistogramAddHistogram)(tmp, &out[idx2]);
cost_combo = FN(BrotliPopulationCost)(tmp);
if (cost_combo < threshold - p.cost_diff) {
p.cost_combo = cost_combo;
is_good_pair = BROTLI_TRUE;
}
}
if (is_good_pair) {
p.cost_diff += p.cost_combo;
if (*num_pairs > 0 && HistogramPairIsLess(&pairs[0], &p)) {
/* Replace the top of the queue if needed. */
if (*num_pairs < max_num_pairs) {
pairs[*num_pairs] = pairs[0];
++(*num_pairs);
}
pairs[0] = p;
} else if (*num_pairs < max_num_pairs) {
pairs[*num_pairs] = p;
++(*num_pairs);
}
}
})
BROTLI_INTERNAL size_t FN(BrotliHistogramCombine)(HistogramType* out,
HistogramType* tmp,
uint32_t* cluster_size,
uint32_t* symbols,
uint32_t* clusters,
HistogramPair* pairs,
size_t num_clusters,
size_t symbols_size,
size_t max_clusters,
size_t max_num_pairs) CODE({
double cost_diff_threshold = 0.0;
size_t min_cluster_size = 1;
size_t num_pairs = 0;
{
/* We maintain a vector of histogram pairs, with the property that the pair
with the maximum bit cost reduction is the first. */
size_t idx1;
for (idx1 = 0; idx1 < num_clusters; ++idx1) {
size_t idx2;
for (idx2 = idx1 + 1; idx2 < num_clusters; ++idx2) {
FN(BrotliCompareAndPushToQueue)(out, tmp, cluster_size, clusters[idx1],
clusters[idx2], max_num_pairs, &pairs[0], &num_pairs);
}
}
}
while (num_clusters > min_cluster_size) {
uint32_t best_idx1;
uint32_t best_idx2;
size_t i;
if (pairs[0].cost_diff >= cost_diff_threshold) {
cost_diff_threshold = 1e99;
min_cluster_size = max_clusters;
continue;
}
/* Take the best pair from the top of heap. */
best_idx1 = pairs[0].idx1;
best_idx2 = pairs[0].idx2;
FN(HistogramAddHistogram)(&out[best_idx1], &out[best_idx2]);
out[best_idx1].bit_cost_ = pairs[0].cost_combo;
cluster_size[best_idx1] += cluster_size[best_idx2];
for (i = 0; i < symbols_size; ++i) {
if (symbols[i] == best_idx2) {
symbols[i] = best_idx1;
}
}
for (i = 0; i < num_clusters; ++i) {
if (clusters[i] == best_idx2) {
memmove(&clusters[i], &clusters[i + 1],
(num_clusters - i - 1) * sizeof(clusters[0]));
break;
}
}
--num_clusters;
{
/* Remove pairs intersecting the just combined best pair. */
size_t copy_to_idx = 0;
for (i = 0; i < num_pairs; ++i) {
HistogramPair* p = &pairs[i];
if (p->idx1 == best_idx1 || p->idx2 == best_idx1 ||
p->idx1 == best_idx2 || p->idx2 == best_idx2) {
/* Remove invalid pair from the queue. */
continue;
}
if (HistogramPairIsLess(&pairs[0], p)) {
/* Replace the top of the queue if needed. */
HistogramPair front = pairs[0];
pairs[0] = *p;
pairs[copy_to_idx] = front;
} else {
pairs[copy_to_idx] = *p;
}
++copy_to_idx;
}
num_pairs = copy_to_idx;
}
/* Push new pairs formed with the combined histogram to the heap. */
for (i = 0; i < num_clusters; ++i) {
FN(BrotliCompareAndPushToQueue)(out, tmp, cluster_size, best_idx1,
clusters[i], max_num_pairs, &pairs[0], &num_pairs);
}
}
return num_clusters;
})
/* What is the bit cost of moving histogram from cur_symbol to candidate. */
BROTLI_INTERNAL double FN(BrotliHistogramBitCostDistance)(
const HistogramType* histogram, const HistogramType* candidate,
HistogramType* tmp) CODE({
if (histogram->total_count_ == 0) {
return 0.0;
} else {
*tmp = *histogram;
FN(HistogramAddHistogram)(tmp, candidate);
return FN(BrotliPopulationCost)(tmp) - candidate->bit_cost_;
}
})
/* Find the best 'out' histogram for each of the 'in' histograms.
When called, clusters[0..num_clusters) contains the unique values from
symbols[0..in_size), but this property is not preserved in this function.
Note: we assume that out[]->bit_cost_ is already up-to-date. */
BROTLI_INTERNAL void FN(BrotliHistogramRemap)(const HistogramType* in,
size_t in_size, const uint32_t* clusters, size_t num_clusters,
HistogramType* out, HistogramType* tmp, uint32_t* symbols) CODE({
size_t i;
for (i = 0; i < in_size; ++i) {
uint32_t best_out = i == 0 ? symbols[0] : symbols[i - 1];
double best_bits =
FN(BrotliHistogramBitCostDistance)(&in[i], &out[best_out], tmp);
size_t j;
for (j = 0; j < num_clusters; ++j) {
const double cur_bits =
FN(BrotliHistogramBitCostDistance)(&in[i], &out[clusters[j]], tmp);
if (cur_bits < best_bits) {
best_bits = cur_bits;
best_out = clusters[j];
}
}
symbols[i] = best_out;
}
/* Recompute each out based on raw and symbols. */
for (i = 0; i < num_clusters; ++i) {
FN(HistogramClear)(&out[clusters[i]]);
}
for (i = 0; i < in_size; ++i) {
FN(HistogramAddHistogram)(&out[symbols[i]], &in[i]);
}
})
/* Reorders elements of the out[0..length) array and changes values in
symbols[0..length) array in the following way:
* when called, symbols[] contains indexes into out[], and has N unique
values (possibly N < length)
* on return, symbols'[i] = f(symbols[i]) and
out'[symbols'[i]] = out[symbols[i]], for each 0 <= i < length,
where f is a bijection between the range of symbols[] and [0..N), and
the first occurrences of values in symbols'[i] come in consecutive
increasing order.
Returns N, the number of unique values in symbols[]. */
BROTLI_INTERNAL size_t FN(BrotliHistogramReindex)(MemoryManager* m,
HistogramType* out, uint32_t* symbols, size_t length) CODE({
static const uint32_t kInvalidIndex = BROTLI_UINT32_MAX;
uint32_t* new_index = BROTLI_ALLOC(m, uint32_t, length);
uint32_t next_index;
HistogramType* tmp;
size_t i;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(new_index)) return 0;
for (i = 0; i < length; ++i) {
new_index[i] = kInvalidIndex;
}
next_index = 0;
for (i = 0; i < length; ++i) {
if (new_index[symbols[i]] == kInvalidIndex) {
new_index[symbols[i]] = next_index;
++next_index;
}
}
/* TODO(eustas): by using idea of "cycle-sort" we can avoid allocation of
tmp and reduce the number of copying by the factor of 2. */
tmp = BROTLI_ALLOC(m, HistogramType, next_index);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(tmp)) return 0;
next_index = 0;
for (i = 0; i < length; ++i) {
if (new_index[symbols[i]] == next_index) {
tmp[next_index] = out[symbols[i]];
++next_index;
}
symbols[i] = new_index[symbols[i]];
}
BROTLI_FREE(m, new_index);
for (i = 0; i < next_index; ++i) {
out[i] = tmp[i];
}
BROTLI_FREE(m, tmp);
return next_index;
})
BROTLI_INTERNAL void FN(BrotliClusterHistograms)(
MemoryManager* m, const HistogramType* in, const size_t in_size,
size_t max_histograms, HistogramType* out, size_t* out_size,
uint32_t* histogram_symbols) CODE({
uint32_t* cluster_size = BROTLI_ALLOC(m, uint32_t, in_size);
uint32_t* clusters = BROTLI_ALLOC(m, uint32_t, in_size);
size_t num_clusters = 0;
const size_t max_input_histograms = 64;
size_t pairs_capacity = max_input_histograms * max_input_histograms / 2;
/* For the first pass of clustering, we allow all pairs. */
HistogramPair* pairs = BROTLI_ALLOC(m, HistogramPair, pairs_capacity + 1);
/* TODO(eustas): move to "persistent" arena? */
HistogramType* tmp = BROTLI_ALLOC(m, HistogramType, 1);
size_t i;
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(cluster_size) ||
BROTLI_IS_NULL(clusters) || BROTLI_IS_NULL(pairs)|| BROTLI_IS_NULL(tmp)) {
return;
}
for (i = 0; i < in_size; ++i) {
cluster_size[i] = 1;
}
for (i = 0; i < in_size; ++i) {
out[i] = in[i];
out[i].bit_cost_ = FN(BrotliPopulationCost)(&in[i]);
histogram_symbols[i] = (uint32_t)i;
}
for (i = 0; i < in_size; i += max_input_histograms) {
size_t num_to_combine =
BROTLI_MIN(size_t, in_size - i, max_input_histograms);
size_t num_new_clusters;
size_t j;
for (j = 0; j < num_to_combine; ++j) {
clusters[num_clusters + j] = (uint32_t)(i + j);
}
num_new_clusters =
FN(BrotliHistogramCombine)(out, tmp, cluster_size,
&histogram_symbols[i],
&clusters[num_clusters], pairs,
num_to_combine, num_to_combine,
max_histograms, pairs_capacity);
num_clusters += num_new_clusters;
}
{
/* For the second pass, we limit the total number of histogram pairs.
After this limit is reached, we only keep searching for the best pair. */
size_t max_num_pairs = BROTLI_MIN(size_t,
64 * num_clusters, (num_clusters / 2) * num_clusters);
BROTLI_ENSURE_CAPACITY(
m, HistogramPair, pairs, pairs_capacity, max_num_pairs + 1);
if (BROTLI_IS_OOM(m)) return;
/* Collapse similar histograms. */
num_clusters = FN(BrotliHistogramCombine)(out, tmp, cluster_size,
histogram_symbols, clusters,
pairs, num_clusters, in_size,
max_histograms, max_num_pairs);
}
BROTLI_FREE(m, pairs);
BROTLI_FREE(m, cluster_size);
/* Find the optimal map from original histograms to the final ones. */
FN(BrotliHistogramRemap)(in, in_size, clusters, num_clusters,
out, tmp, histogram_symbols);
BROTLI_FREE(m, tmp);
BROTLI_FREE(m, clusters);
/* Convert the context map to a canonical form. */
*out_size = FN(BrotliHistogramReindex)(m, out, histogram_symbols, in_size);
if (BROTLI_IS_OOM(m)) return;
})
#undef HistogramType

28
c/enc/command.c Normal file
View File

@@ -0,0 +1,28 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "command.h"
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
const uint32_t kBrotliInsBase[BROTLI_NUM_INS_COPY_CODES] = {
0, 1, 2, 3, 4, 5, 6, 8, 10, 14, 18, 26,
34, 50, 66, 98, 130, 194, 322, 578, 1090, 2114, 6210, 22594};
const uint32_t kBrotliInsExtra[BROTLI_NUM_INS_COPY_CODES] = {
0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 9, 10, 12, 14, 24};
const uint32_t kBrotliCopyBase[BROTLI_NUM_INS_COPY_CODES] = {
2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, 18,
22, 30, 38, 54, 70, 102, 134, 198, 326, 582, 1094, 2118};
const uint32_t kBrotliCopyExtra[BROTLI_NUM_INS_COPY_CODES] = {
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 9, 10, 24};
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

189
c/enc/command.h Normal file
View File

@@ -0,0 +1,189 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* This class models a sequence of literals and a backward reference copy. */
#ifndef BROTLI_ENC_COMMAND_H_
#define BROTLI_ENC_COMMAND_H_
#include "../common/constants.h"
#include "../common/platform.h"
#include "fast_log.h"
#include "params.h"
#include "prefix.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
BROTLI_INTERNAL extern const BROTLI_MODEL("small")
uint32_t kBrotliInsBase[BROTLI_NUM_INS_COPY_CODES];
BROTLI_INTERNAL extern const BROTLI_MODEL("small")
uint32_t kBrotliInsExtra[BROTLI_NUM_INS_COPY_CODES];
BROTLI_INTERNAL extern const BROTLI_MODEL("small")
uint32_t kBrotliCopyBase[BROTLI_NUM_INS_COPY_CODES];
BROTLI_INTERNAL extern const BROTLI_MODEL("small")
uint32_t kBrotliCopyExtra[BROTLI_NUM_INS_COPY_CODES];
static BROTLI_INLINE uint16_t GetInsertLengthCode(size_t insertlen) {
if (insertlen < 6) {
return (uint16_t)insertlen;
} else if (insertlen < 130) {
uint32_t nbits = Log2FloorNonZero(insertlen - 2) - 1u;
return (uint16_t)((nbits << 1) + ((insertlen - 2) >> nbits) + 2);
} else if (insertlen < 2114) {
return (uint16_t)(Log2FloorNonZero(insertlen - 66) + 10);
} else if (insertlen < 6210) {
return 21u;
} else if (insertlen < 22594) {
return 22u;
} else {
return 23u;
}
}
static BROTLI_INLINE uint16_t GetCopyLengthCode(size_t copylen) {
if (copylen < 10) {
return (uint16_t)(copylen - 2);
} else if (copylen < 134) {
uint32_t nbits = Log2FloorNonZero(copylen - 6) - 1u;
return (uint16_t)((nbits << 1) + ((copylen - 6) >> nbits) + 4);
} else if (copylen < 2118) {
return (uint16_t)(Log2FloorNonZero(copylen - 70) + 12);
} else {
return 23u;
}
}
static BROTLI_INLINE uint16_t CombineLengthCodes(
uint16_t inscode, uint16_t copycode, BROTLI_BOOL use_last_distance) {
uint16_t bits64 =
(uint16_t)((copycode & 0x7u) | ((inscode & 0x7u) << 3u));
if (use_last_distance && inscode < 8u && copycode < 16u) {
return (copycode < 8u) ? bits64 : (bits64 | 64u);
} else {
/* Specification: 5 Encoding of ... (last table) */
/* offset = 2 * index, where index is in range [0..8] */
uint32_t offset = 2u * ((copycode >> 3u) + 3u * (inscode >> 3u));
/* All values in specification are K * 64,
where K = [2, 3, 6, 4, 5, 8, 7, 9, 10],
i + 1 = [1, 2, 3, 4, 5, 6, 7, 8, 9],
K - i - 1 = [1, 1, 3, 0, 0, 2, 0, 1, 2] = D.
All values in D require only 2 bits to encode.
Magic constant is shifted 6 bits left, to avoid final multiplication. */
offset = (offset << 5u) + 0x40u + ((0x520D40u >> offset) & 0xC0u);
return (uint16_t)(offset | bits64);
}
}
static BROTLI_INLINE void GetLengthCode(size_t insertlen, size_t copylen,
BROTLI_BOOL use_last_distance,
uint16_t* code) {
uint16_t inscode = GetInsertLengthCode(insertlen);
uint16_t copycode = GetCopyLengthCode(copylen);
*code = CombineLengthCodes(inscode, copycode, use_last_distance);
}
static BROTLI_INLINE uint32_t GetInsertBase(uint16_t inscode) {
return kBrotliInsBase[inscode];
}
static BROTLI_INLINE uint32_t GetInsertExtra(uint16_t inscode) {
return kBrotliInsExtra[inscode];
}
static BROTLI_INLINE uint32_t GetCopyBase(uint16_t copycode) {
return kBrotliCopyBase[copycode];
}
static BROTLI_INLINE uint32_t GetCopyExtra(uint16_t copycode) {
return kBrotliCopyExtra[copycode];
}
typedef struct Command {
uint32_t insert_len_;
/* Stores copy_len in low 25 bits and copy_code - copy_len in high 7 bit. */
uint32_t copy_len_;
/* Stores distance extra bits. */
uint32_t dist_extra_;
uint16_t cmd_prefix_;
/* Stores distance code in low 10 bits
and number of extra bits in high 6 bits. */
uint16_t dist_prefix_;
} Command;
/* distance_code is e.g. 0 for same-as-last short code, or 16 for offset 1. */
static BROTLI_INLINE void InitCommand(Command* self,
const BrotliDistanceParams* dist, size_t insertlen,
size_t copylen, int copylen_code_delta, size_t distance_code) {
/* Don't rely on signed int representation, use honest casts. */
uint32_t delta = (uint8_t)((int8_t)copylen_code_delta);
self->insert_len_ = (uint32_t)insertlen;
self->copy_len_ = (uint32_t)(copylen | (delta << 25));
/* The distance prefix and extra bits are stored in this Command as if
npostfix and ndirect were 0, they are only recomputed later after the
clustering if needed. */
PrefixEncodeCopyDistance(
distance_code, dist->num_direct_distance_codes,
dist->distance_postfix_bits, &self->dist_prefix_, &self->dist_extra_);
GetLengthCode(
insertlen, (size_t)((int)copylen + copylen_code_delta),
TO_BROTLI_BOOL((self->dist_prefix_ & 0x3FF) == 0), &self->cmd_prefix_);
}
static BROTLI_INLINE void InitInsertCommand(Command* self, size_t insertlen) {
self->insert_len_ = (uint32_t)insertlen;
self->copy_len_ = 4 << 25;
self->dist_extra_ = 0;
self->dist_prefix_ = BROTLI_NUM_DISTANCE_SHORT_CODES;
GetLengthCode(insertlen, 4, BROTLI_FALSE, &self->cmd_prefix_);
}
static BROTLI_INLINE uint32_t CommandRestoreDistanceCode(
const Command* self, const BrotliDistanceParams* dist) {
if ((self->dist_prefix_ & 0x3FFu) <
BROTLI_NUM_DISTANCE_SHORT_CODES + dist->num_direct_distance_codes) {
return self->dist_prefix_ & 0x3FFu;
} else {
uint32_t dcode = self->dist_prefix_ & 0x3FFu;
uint32_t nbits = self->dist_prefix_ >> 10;
uint32_t extra = self->dist_extra_;
uint32_t postfix_mask = (1U << dist->distance_postfix_bits) - 1U;
uint32_t hcode = (dcode - dist->num_direct_distance_codes -
BROTLI_NUM_DISTANCE_SHORT_CODES) >>
dist->distance_postfix_bits;
uint32_t lcode = (dcode - dist->num_direct_distance_codes -
BROTLI_NUM_DISTANCE_SHORT_CODES) & postfix_mask;
uint32_t offset = ((2U + (hcode & 1U)) << nbits) - 4U;
return ((offset + extra) << dist->distance_postfix_bits) + lcode +
dist->num_direct_distance_codes + BROTLI_NUM_DISTANCE_SHORT_CODES;
}
}
static BROTLI_INLINE uint32_t CommandDistanceContext(const Command* self) {
uint32_t r = self->cmd_prefix_ >> 6;
uint32_t c = self->cmd_prefix_ & 7;
if ((r == 0 || r == 2 || r == 4 || r == 7) && (c <= 2)) {
return c;
}
return 3;
}
static BROTLI_INLINE uint32_t CommandCopyLen(const Command* self) {
return self->copy_len_ & 0x1FFFFFF;
}
static BROTLI_INLINE uint32_t CommandCopyLenCode(const Command* self) {
uint32_t modifier = self->copy_len_ >> 25;
int32_t delta = (int8_t)((uint8_t)(modifier | ((modifier & 0x40) << 1)));
return (uint32_t)((int32_t)(self->copy_len_ & 0x1FFFFFF) + delta);
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_COMMAND_H_ */

205
c/enc/compound_dictionary.c Normal file
View File

@@ -0,0 +1,205 @@
/* Copyright 2017 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "compound_dictionary.h"
#include "../common/platform.h"
#include <brotli/shared_dictionary.h>
#include "memory.h"
static PreparedDictionary* CreatePreparedDictionaryWithParams(MemoryManager* m,
const uint8_t* source, size_t source_size, uint32_t bucket_bits,
uint32_t slot_bits, uint32_t hash_bits, uint16_t bucket_limit) {
/* Step 1: create "bloated" hasher. */
uint32_t num_slots = 1u << slot_bits;
uint32_t num_buckets = 1u << bucket_bits;
uint32_t hash_shift = 64u - bucket_bits;
uint64_t hash_mask = (~((uint64_t)0U)) >> (64 - hash_bits);
uint32_t slot_mask = num_slots - 1;
size_t alloc_size = (sizeof(uint32_t) << slot_bits) +
(sizeof(uint32_t) << slot_bits) +
(sizeof(uint16_t) << bucket_bits) +
(sizeof(uint32_t) << bucket_bits) +
(sizeof(uint32_t) * source_size);
uint8_t* flat = NULL;
PreparedDictionary* result = NULL;
uint16_t* num = NULL;
uint32_t* bucket_heads = NULL;
uint32_t* next_bucket = NULL;
uint32_t* slot_offsets = NULL;
uint16_t* heads = NULL;
uint32_t* items = NULL;
uint8_t** source_ref = NULL;
uint32_t i;
uint32_t* slot_size = NULL;
uint32_t* slot_limit = NULL;
uint32_t total_items = 0;
if (slot_bits > 16) return NULL;
if (slot_bits > bucket_bits) return NULL;
if (bucket_bits - slot_bits >= 16) return NULL;
flat = BROTLI_ALLOC(m, uint8_t, alloc_size);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(flat)) return NULL;
slot_size = (uint32_t*)flat;
slot_limit = (uint32_t*)(&slot_size[num_slots]);
num = (uint16_t*)(&slot_limit[num_slots]);
bucket_heads = (uint32_t*)(&num[num_buckets]);
next_bucket = (uint32_t*)(&bucket_heads[num_buckets]);
memset(num, 0, num_buckets * sizeof(num[0]));
/* TODO(eustas): apply custom "store" order. */
for (i = 0; i + 7 < source_size; ++i) {
const uint64_t h = (BROTLI_UNALIGNED_LOAD64LE(&source[i]) & hash_mask) *
kPreparedDictionaryHashMul64Long;
const uint32_t key = (uint32_t)(h >> hash_shift);
uint16_t count = num[key];
next_bucket[i] = (count == 0) ? ((uint32_t)(-1)) : bucket_heads[key];
bucket_heads[key] = i;
count++;
if (count > bucket_limit) count = bucket_limit;
num[key] = count;
}
/* Step 2: find slot limits. */
for (i = 0; i < num_slots; ++i) {
BROTLI_BOOL overflow = BROTLI_FALSE;
slot_limit[i] = bucket_limit;
while (BROTLI_TRUE) {
uint32_t limit = slot_limit[i];
size_t j;
uint32_t count = 0;
overflow = BROTLI_FALSE;
for (j = i; j < num_buckets; j += num_slots) {
uint32_t size = num[j];
/* Last chain may span behind 64K limit; overflow happens only if
we are about to use 0xFFFF+ as item offset. */
if (count >= 0xFFFF) {
overflow = BROTLI_TRUE;
break;
}
if (size > limit) size = limit;
count += size;
}
if (!overflow) {
slot_size[i] = count;
total_items += count;
break;
}
slot_limit[i]--;
}
}
/* Step 3: transfer data to "slim" hasher. */
alloc_size = sizeof(PreparedDictionary) + (sizeof(uint32_t) << slot_bits) +
(sizeof(uint16_t) << bucket_bits) + (sizeof(uint32_t) * total_items) +
sizeof(uint8_t*);
result = (PreparedDictionary*)BROTLI_ALLOC(m, uint8_t, alloc_size);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(result)) {
BROTLI_FREE(m, flat);
return NULL;
}
slot_offsets = (uint32_t*)(&result[1]);
heads = (uint16_t*)(&slot_offsets[num_slots]);
items = (uint32_t*)(&heads[num_buckets]);
source_ref = (uint8_t**)(&items[total_items]);
result->magic = kLeanPreparedDictionaryMagic;
result->num_items = total_items;
result->source_size = (uint32_t)source_size;
result->hash_bits = hash_bits;
result->bucket_bits = bucket_bits;
result->slot_bits = slot_bits;
BROTLI_UNALIGNED_STORE_PTR(source_ref, source);
total_items = 0;
for (i = 0; i < num_slots; ++i) {
slot_offsets[i] = total_items;
total_items += slot_size[i];
slot_size[i] = 0;
}
for (i = 0; i < num_buckets; ++i) {
uint32_t slot = i & slot_mask;
uint32_t count = num[i];
uint32_t pos;
size_t j;
size_t cursor = slot_size[slot];
if (count > slot_limit[slot]) count = slot_limit[slot];
if (count == 0) {
heads[i] = 0xFFFF;
continue;
}
heads[i] = (uint16_t)cursor;
cursor += slot_offsets[slot];
slot_size[slot] += count;
pos = bucket_heads[i];
for (j = 0; j < count; j++) {
items[cursor++] = pos;
pos = next_bucket[pos];
}
items[cursor - 1] |= 0x80000000;
}
BROTLI_FREE(m, flat);
return result;
}
PreparedDictionary* CreatePreparedDictionary(MemoryManager* m,
const uint8_t* source, size_t source_size) {
uint32_t bucket_bits = 17;
uint32_t slot_bits = 7;
uint32_t hash_bits = 40;
uint16_t bucket_limit = 32;
size_t volume = 16u << bucket_bits;
/* Tune parameters to fit dictionary size. */
while (volume < source_size && bucket_bits < 22) {
bucket_bits++;
slot_bits++;
volume <<= 1;
}
return CreatePreparedDictionaryWithParams(m,
source, source_size, bucket_bits, slot_bits, hash_bits, bucket_limit);
}
void DestroyPreparedDictionary(MemoryManager* m,
PreparedDictionary* dictionary) {
if (!dictionary) return;
BROTLI_FREE(m, dictionary);
}
BROTLI_BOOL AttachPreparedDictionary(
CompoundDictionary* compound, const PreparedDictionary* dictionary) {
size_t length = 0;
size_t index = 0;
if (compound->num_chunks == SHARED_BROTLI_MAX_COMPOUND_DICTS) {
return BROTLI_FALSE;
}
if (!dictionary) return BROTLI_FALSE;
length = dictionary->source_size;
index = compound->num_chunks;
compound->total_size += length;
compound->chunks[index] = dictionary;
compound->chunk_offsets[index + 1] = compound->total_size;
{
uint32_t* slot_offsets = (uint32_t*)(&dictionary[1]);
uint16_t* heads = (uint16_t*)(&slot_offsets[(size_t)1u << dictionary->slot_bits]);
uint32_t* items = (uint32_t*)(&heads[(size_t)1u << dictionary->bucket_bits]);
const void* tail = (void*)&items[dictionary->num_items];
if (dictionary->magic == kPreparedDictionaryMagic) {
compound->chunk_source[index] = (const uint8_t*)tail;
} else {
/* dictionary->magic == kLeanPreparedDictionaryMagic */
compound->chunk_source[index] =
(const uint8_t*)BROTLI_UNALIGNED_LOAD_PTR((const uint8_t**)tail);
}
}
compound->num_chunks++;
return BROTLI_TRUE;
}

View File

@@ -0,0 +1,71 @@
/* Copyright 2017 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#ifndef BROTLI_ENC_PREPARED_DICTIONARY_H_
#define BROTLI_ENC_PREPARED_DICTIONARY_H_
#include "../common/platform.h"
#include <brotli/shared_dictionary.h>
#include "memory.h"
/* "Fat" prepared dictionary, could be cooked outside of C implementation,
* e.g. on Java side. LZ77 data is copied inside PreparedDictionary struct. */
static const uint32_t kPreparedDictionaryMagic = 0xDEBCEDE0;
static const uint32_t kSharedDictionaryMagic = 0xDEBCEDE1;
static const uint32_t kManagedDictionaryMagic = 0xDEBCEDE2;
/* "Lean" prepared dictionary. LZ77 data is referenced. It is the responsibility
* of caller of "prepare dictionary" to keep the LZ77 data while prepared
* dictionary is in use. */
static const uint32_t kLeanPreparedDictionaryMagic = 0xDEBCEDE3;
static const uint64_t kPreparedDictionaryHashMul64Long =
BROTLI_MAKE_UINT64_T(0x1FE35A7Bu, 0xD3579BD3u);
typedef struct PreparedDictionary {
uint32_t magic;
uint32_t num_items;
uint32_t source_size;
uint32_t hash_bits;
uint32_t bucket_bits;
uint32_t slot_bits;
/* --- Dynamic size members --- */
/* uint32_t slot_offsets[1 << slot_bits]; */
/* uint16_t heads[1 << bucket_bits]; */
/* uint32_t items[variable]; */
/* [maybe] uint8_t* source_ref, depending on magic. */
/* [maybe] uint8_t source[source_size], depending on magic. */
} PreparedDictionary;
BROTLI_INTERNAL PreparedDictionary* CreatePreparedDictionary(MemoryManager* m,
const uint8_t* source, size_t source_size);
BROTLI_INTERNAL void DestroyPreparedDictionary(MemoryManager* m,
PreparedDictionary* dictionary);
typedef struct CompoundDictionary {
/* LZ77 prefix, compound dictionary */
size_t num_chunks;
size_t total_size;
/* Client instances. */
const PreparedDictionary* chunks[SHARED_BROTLI_MAX_COMPOUND_DICTS + 1];
const uint8_t* chunk_source[SHARED_BROTLI_MAX_COMPOUND_DICTS + 1];
size_t chunk_offsets[SHARED_BROTLI_MAX_COMPOUND_DICTS + 1];
size_t num_prepared_instances_;
/* Owned instances. */
PreparedDictionary* prepared_instances_[SHARED_BROTLI_MAX_COMPOUND_DICTS + 1];
} CompoundDictionary;
BROTLI_INTERNAL BROTLI_BOOL AttachPreparedDictionary(
CompoundDictionary* compound, const PreparedDictionary* dictionary);
#endif /* BROTLI_ENC_PREPARED_DICTIONARY */

790
c/enc/compress_fragment.c Normal file
View File

@@ -0,0 +1,790 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function for fast encoding of an input fragment, independently from the input
history. This function uses one-pass processing: when we find a backward
match, we immediately emit the corresponding command and literal codes to
the bit stream.
Adapted from the CompressFragment() function in
https://github.com/google/snappy/blob/master/snappy.cc */
#include "compress_fragment.h"
#include "../common/constants.h"
#include "../common/platform.h"
#include "brotli_bit_stream.h"
#include "entropy_encode.h"
#include "fast_log.h"
#include "find_match_length.h"
#include "hash_base.h"
#include "write_bits.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#define MAX_DISTANCE (long)BROTLI_MAX_BACKWARD_LIMIT(18)
static BROTLI_INLINE uint32_t Hash(const uint8_t* p, size_t shift) {
const uint64_t h = (BROTLI_UNALIGNED_LOAD64LE(p) << 24) * kHashMul32;
return (uint32_t)(h >> shift);
}
static BROTLI_INLINE uint32_t HashBytesAtOffset(
uint64_t v, int offset, size_t shift) {
BROTLI_DCHECK(offset >= 0);
BROTLI_DCHECK(offset <= 3);
{
const uint64_t h = ((v >> (8 * offset)) << 24) * kHashMul32;
return (uint32_t)(h >> shift);
}
}
static BROTLI_INLINE BROTLI_BOOL IsMatch(const uint8_t* p1, const uint8_t* p2) {
return TO_BROTLI_BOOL(
BrotliUnalignedRead32(p1) == BrotliUnalignedRead32(p2) &&
p1[4] == p2[4]);
}
/* Builds a literal prefix code into "depths" and "bits" based on the statistics
of the "input" string and stores it into the bit stream.
Note that the prefix code here is built from the pre-LZ77 input, therefore
we can only approximate the statistics of the actual literal stream.
Moreover, for long inputs we build a histogram from a sample of the input
and thus have to assign a non-zero depth for each literal.
Returns estimated compression ratio millibytes/char for encoding given input
with generated code. */
static size_t BuildAndStoreLiteralPrefixCode(BrotliOnePassArena* s,
const uint8_t* input,
const size_t input_size,
uint8_t depths[256],
uint16_t bits[256],
size_t* storage_ix,
uint8_t* storage) {
uint32_t* BROTLI_RESTRICT const histogram = s->histogram;
size_t histogram_total;
size_t i;
memset(histogram, 0, sizeof(s->histogram));
if (input_size < (1 << 15)) {
for (i = 0; i < input_size; ++i) {
++histogram[input[i]];
}
histogram_total = input_size;
for (i = 0; i < 256; ++i) {
/* We weigh the first 11 samples with weight 3 to account for the
balancing effect of the LZ77 phase on the histogram. */
const uint32_t adjust = 2 * BROTLI_MIN(uint32_t, histogram[i], 11u);
histogram[i] += adjust;
histogram_total += adjust;
}
} else {
static const size_t kSampleRate = 29;
for (i = 0; i < input_size; i += kSampleRate) {
++histogram[input[i]];
}
histogram_total = (input_size + kSampleRate - 1) / kSampleRate;
for (i = 0; i < 256; ++i) {
/* We add 1 to each population count to avoid 0 bit depths (since this is
only a sample and we don't know if the symbol appears or not), and we
weigh the first 11 samples with weight 3 to account for the balancing
effect of the LZ77 phase on the histogram (more frequent symbols are
more likely to be in backward references instead as literals). */
const uint32_t adjust = 1 + 2 * BROTLI_MIN(uint32_t, histogram[i], 11u);
histogram[i] += adjust;
histogram_total += adjust;
}
}
BrotliBuildAndStoreHuffmanTreeFast(s->tree, histogram, histogram_total,
/* max_bits = */ 8,
depths, bits, storage_ix, storage);
{
size_t literal_ratio = 0;
for (i = 0; i < 256; ++i) {
if (histogram[i]) literal_ratio += histogram[i] * depths[i];
}
/* Estimated encoding ratio, millibytes per symbol. */
return (literal_ratio * 125) / histogram_total;
}
}
/* Builds a command and distance prefix code (each 64 symbols) into "depth" and
"bits" based on "histogram" and stores it into the bit stream. */
static void BuildAndStoreCommandPrefixCode(BrotliOnePassArena* s,
size_t* storage_ix, uint8_t* storage) {
const uint32_t* const histogram = s->cmd_histo;
uint8_t* const depth = s->cmd_depth;
uint16_t* const bits = s->cmd_bits;
uint8_t* BROTLI_RESTRICT const tmp_depth = s->tmp_depth;
uint16_t* BROTLI_RESTRICT const tmp_bits = s->tmp_bits;
/* TODO(eustas): do only once on initialization. */
memset(tmp_depth, 0, BROTLI_NUM_COMMAND_SYMBOLS);
BrotliCreateHuffmanTree(histogram, 64, 15, s->tree, depth);
BrotliCreateHuffmanTree(&histogram[64], 64, 14, s->tree, &depth[64]);
/* We have to jump through a few hoops here in order to compute
the command bits because the symbols are in a different order than in
the full alphabet. This looks complicated, but having the symbols
in this order in the command bits saves a few branches in the Emit*
functions. */
memcpy(tmp_depth, depth, 24);
memcpy(tmp_depth + 24, depth + 40, 8);
memcpy(tmp_depth + 32, depth + 24, 8);
memcpy(tmp_depth + 40, depth + 48, 8);
memcpy(tmp_depth + 48, depth + 32, 8);
memcpy(tmp_depth + 56, depth + 56, 8);
BrotliConvertBitDepthsToSymbols(tmp_depth, 64, tmp_bits);
memcpy(bits, tmp_bits, 48);
memcpy(bits + 24, tmp_bits + 32, 16);
memcpy(bits + 32, tmp_bits + 48, 16);
memcpy(bits + 40, tmp_bits + 24, 16);
memcpy(bits + 48, tmp_bits + 40, 16);
memcpy(bits + 56, tmp_bits + 56, 16);
BrotliConvertBitDepthsToSymbols(&depth[64], 64, &bits[64]);
{
/* Create the bit length array for the full command alphabet. */
size_t i;
memset(tmp_depth, 0, 64); /* only 64 first values were used */
memcpy(tmp_depth, depth, 8);
memcpy(tmp_depth + 64, depth + 8, 8);
memcpy(tmp_depth + 128, depth + 16, 8);
memcpy(tmp_depth + 192, depth + 24, 8);
memcpy(tmp_depth + 384, depth + 32, 8);
for (i = 0; i < 8; ++i) {
tmp_depth[128 + 8 * i] = depth[40 + i];
tmp_depth[256 + 8 * i] = depth[48 + i];
tmp_depth[448 + 8 * i] = depth[56 + i];
}
/* TODO(eustas): could/should full-length machinery be avoided? */
BrotliStoreHuffmanTree(
tmp_depth, BROTLI_NUM_COMMAND_SYMBOLS, s->tree, storage_ix, storage);
}
BrotliStoreHuffmanTree(&depth[64], 64, s->tree, storage_ix, storage);
}
/* REQUIRES: insertlen < 6210 */
static BROTLI_INLINE void EmitInsertLen(size_t insertlen,
const uint8_t depth[128],
const uint16_t bits[128],
uint32_t histo[128],
size_t* storage_ix,
uint8_t* storage) {
if (insertlen < 6) {
const size_t code = insertlen + 40;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
++histo[code];
} else if (insertlen < 130) {
const size_t tail = insertlen - 2;
const uint32_t nbits = Log2FloorNonZero(tail) - 1u;
const size_t prefix = tail >> nbits;
const size_t inscode = (nbits << 1) + prefix + 42;
BrotliWriteBits(depth[inscode], bits[inscode], storage_ix, storage);
BrotliWriteBits(nbits, tail - (prefix << nbits), storage_ix, storage);
++histo[inscode];
} else if (insertlen < 2114) {
const size_t tail = insertlen - 66;
const uint32_t nbits = Log2FloorNonZero(tail);
const size_t code = nbits + 50;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
BrotliWriteBits(nbits, tail - ((size_t)1 << nbits), storage_ix, storage);
++histo[code];
} else {
BrotliWriteBits(depth[61], bits[61], storage_ix, storage);
BrotliWriteBits(12, insertlen - 2114, storage_ix, storage);
++histo[61];
}
}
static BROTLI_INLINE void EmitLongInsertLen(size_t insertlen,
const uint8_t depth[128],
const uint16_t bits[128],
uint32_t histo[128],
size_t* storage_ix,
uint8_t* storage) {
if (insertlen < 22594) {
BrotliWriteBits(depth[62], bits[62], storage_ix, storage);
BrotliWriteBits(14, insertlen - 6210, storage_ix, storage);
++histo[62];
} else {
BrotliWriteBits(depth[63], bits[63], storage_ix, storage);
BrotliWriteBits(24, insertlen - 22594, storage_ix, storage);
++histo[63];
}
}
static BROTLI_INLINE void EmitCopyLen(size_t copylen,
const uint8_t depth[128],
const uint16_t bits[128],
uint32_t histo[128],
size_t* storage_ix,
uint8_t* storage) {
if (copylen < 10) {
BrotliWriteBits(
depth[copylen + 14], bits[copylen + 14], storage_ix, storage);
++histo[copylen + 14];
} else if (copylen < 134) {
const size_t tail = copylen - 6;
const uint32_t nbits = Log2FloorNonZero(tail) - 1u;
const size_t prefix = tail >> nbits;
const size_t code = (nbits << 1) + prefix + 20;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
BrotliWriteBits(nbits, tail - (prefix << nbits), storage_ix, storage);
++histo[code];
} else if (copylen < 2118) {
const size_t tail = copylen - 70;
const uint32_t nbits = Log2FloorNonZero(tail);
const size_t code = nbits + 28;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
BrotliWriteBits(nbits, tail - ((size_t)1 << nbits), storage_ix, storage);
++histo[code];
} else {
BrotliWriteBits(depth[39], bits[39], storage_ix, storage);
BrotliWriteBits(24, copylen - 2118, storage_ix, storage);
++histo[39];
}
}
static BROTLI_INLINE void EmitCopyLenLastDistance(size_t copylen,
const uint8_t depth[128],
const uint16_t bits[128],
uint32_t histo[128],
size_t* storage_ix,
uint8_t* storage) {
if (copylen < 12) {
BrotliWriteBits(depth[copylen - 4], bits[copylen - 4], storage_ix, storage);
++histo[copylen - 4];
} else if (copylen < 72) {
const size_t tail = copylen - 8;
const uint32_t nbits = Log2FloorNonZero(tail) - 1;
const size_t prefix = tail >> nbits;
const size_t code = (nbits << 1) + prefix + 4;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
BrotliWriteBits(nbits, tail - (prefix << nbits), storage_ix, storage);
++histo[code];
} else if (copylen < 136) {
const size_t tail = copylen - 8;
const size_t code = (tail >> 5) + 30;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
BrotliWriteBits(5, tail & 31, storage_ix, storage);
BrotliWriteBits(depth[64], bits[64], storage_ix, storage);
++histo[code];
++histo[64];
} else if (copylen < 2120) {
const size_t tail = copylen - 72;
const uint32_t nbits = Log2FloorNonZero(tail);
const size_t code = nbits + 28;
BrotliWriteBits(depth[code], bits[code], storage_ix, storage);
BrotliWriteBits(nbits, tail - ((size_t)1 << nbits), storage_ix, storage);
BrotliWriteBits(depth[64], bits[64], storage_ix, storage);
++histo[code];
++histo[64];
} else {
BrotliWriteBits(depth[39], bits[39], storage_ix, storage);
BrotliWriteBits(24, copylen - 2120, storage_ix, storage);
BrotliWriteBits(depth[64], bits[64], storage_ix, storage);
++histo[39];
++histo[64];
}
}
static BROTLI_INLINE void EmitDistance(size_t distance,
const uint8_t depth[128],
const uint16_t bits[128],
uint32_t histo[128],
size_t* storage_ix, uint8_t* storage) {
const size_t d = distance + 3;
const uint32_t nbits = Log2FloorNonZero(d) - 1u;
const size_t prefix = (d >> nbits) & 1;
const size_t offset = (2 + prefix) << nbits;
const size_t distcode = 2 * (nbits - 1) + prefix + 80;
BrotliWriteBits(depth[distcode], bits[distcode], storage_ix, storage);
BrotliWriteBits(nbits, d - offset, storage_ix, storage);
++histo[distcode];
}
static BROTLI_INLINE void EmitLiterals(const uint8_t* input, const size_t len,
const uint8_t depth[256],
const uint16_t bits[256],
size_t* storage_ix, uint8_t* storage) {
size_t j;
for (j = 0; j < len; j++) {
const uint8_t lit = input[j];
BrotliWriteBits(depth[lit], bits[lit], storage_ix, storage);
}
}
/* REQUIRES: len <= 1 << 24. */
static void BrotliStoreMetaBlockHeader(
size_t len, BROTLI_BOOL is_uncompressed, size_t* storage_ix,
uint8_t* storage) {
size_t nibbles = 6;
/* ISLAST */
BrotliWriteBits(1, 0, storage_ix, storage);
if (len <= (1U << 16)) {
nibbles = 4;
} else if (len <= (1U << 20)) {
nibbles = 5;
}
BrotliWriteBits(2, nibbles - 4, storage_ix, storage);
BrotliWriteBits(nibbles * 4, len - 1, storage_ix, storage);
/* ISUNCOMPRESSED */
BrotliWriteBits(1, (uint64_t)is_uncompressed, storage_ix, storage);
}
static void UpdateBits(size_t n_bits, uint32_t bits, size_t pos,
uint8_t* array) {
while (n_bits > 0) {
size_t byte_pos = pos >> 3;
size_t n_unchanged_bits = pos & 7;
size_t n_changed_bits = BROTLI_MIN(size_t, n_bits, 8 - n_unchanged_bits);
size_t total_bits = n_unchanged_bits + n_changed_bits;
uint32_t mask =
(~((1u << total_bits) - 1u)) | ((1u << n_unchanged_bits) - 1u);
uint32_t unchanged_bits = array[byte_pos] & mask;
uint32_t changed_bits = bits & ((1u << n_changed_bits) - 1u);
array[byte_pos] =
(uint8_t)((changed_bits << n_unchanged_bits) | unchanged_bits);
n_bits -= n_changed_bits;
bits >>= n_changed_bits;
pos += n_changed_bits;
}
}
static void RewindBitPosition(const size_t new_storage_ix,
size_t* storage_ix, uint8_t* storage) {
const size_t bitpos = new_storage_ix & 7;
const size_t mask = (1u << bitpos) - 1;
storage[new_storage_ix >> 3] &= (uint8_t)mask;
*storage_ix = new_storage_ix;
}
static BROTLI_BOOL ShouldMergeBlock(BrotliOnePassArena* s,
const uint8_t* data, size_t len, const uint8_t* depths) {
uint32_t* BROTLI_RESTRICT const histo = s->histogram;
static const size_t kSampleRate = 43;
size_t i;
memset(histo, 0, sizeof(s->histogram));
for (i = 0; i < len; i += kSampleRate) {
++histo[data[i]];
}
{
const size_t total = (len + kSampleRate - 1) / kSampleRate;
double r = (FastLog2(total) + 0.5) * (double)total + 200;
for (i = 0; i < 256; ++i) {
r -= (double)histo[i] * (depths[i] + FastLog2(histo[i]));
}
return TO_BROTLI_BOOL(r >= 0.0);
}
}
/* Acceptable loss for uncompressible speedup is 2% */
#define MIN_RATIO 980
static BROTLI_INLINE BROTLI_BOOL ShouldUseUncompressedMode(
const uint8_t* metablock_start, const uint8_t* next_emit,
const size_t insertlen, const size_t literal_ratio) {
const size_t compressed = (size_t)(next_emit - metablock_start);
if (compressed * 50 > insertlen) {
return BROTLI_FALSE;
} else {
return TO_BROTLI_BOOL(literal_ratio > MIN_RATIO);
}
}
static void EmitUncompressedMetaBlock(const uint8_t* begin, const uint8_t* end,
const size_t storage_ix_start,
size_t* storage_ix, uint8_t* storage) {
const size_t len = (size_t)(end - begin);
RewindBitPosition(storage_ix_start, storage_ix, storage);
BrotliStoreMetaBlockHeader(len, 1, storage_ix, storage);
*storage_ix = (*storage_ix + 7u) & ~7u;
memcpy(&storage[*storage_ix >> 3], begin, len);
*storage_ix += len << 3;
storage[*storage_ix >> 3] = 0;
}
static BROTLI_MODEL("small") uint32_t kCmdHistoSeed[128] = {
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 0, 0, 0, 0,
};
static BROTLI_INLINE void BrotliCompressFragmentFastImpl(
BrotliOnePassArena* s, const uint8_t* input, size_t input_size,
BROTLI_BOOL is_last, int* table, size_t table_bits,
size_t* storage_ix, uint8_t* storage) {
uint8_t* BROTLI_RESTRICT const cmd_depth = s->cmd_depth;
uint16_t* BROTLI_RESTRICT const cmd_bits = s->cmd_bits;
uint32_t* BROTLI_RESTRICT const cmd_histo = s->cmd_histo;
uint8_t* BROTLI_RESTRICT const lit_depth = s->lit_depth;
uint16_t* BROTLI_RESTRICT const lit_bits = s->lit_bits;
const uint8_t* ip_end;
/* "next_emit" is a pointer to the first byte that is not covered by a
previous copy. Bytes between "next_emit" and the start of the next copy or
the end of the input will be emitted as literal bytes. */
const uint8_t* next_emit = input;
/* Save the start of the first block for position and distance computations.
*/
const uint8_t* base_ip = input;
static const size_t kFirstBlockSize = 3 << 15;
static const size_t kMergeBlockSize = 1 << 16;
const size_t kInputMarginBytes = BROTLI_WINDOW_GAP;
const size_t kMinMatchLen = 5;
const uint8_t* metablock_start = input;
size_t block_size = BROTLI_MIN(size_t, input_size, kFirstBlockSize);
size_t total_block_size = block_size;
/* Save the bit position of the MLEN field of the meta-block header, so that
we can update it later if we decide to extend this meta-block. */
size_t mlen_storage_ix = *storage_ix + 3;
size_t literal_ratio;
const uint8_t* ip;
int last_distance;
const size_t shift = 64u - table_bits;
BrotliStoreMetaBlockHeader(block_size, 0, storage_ix, storage);
/* No block splits, no contexts. */
BrotliWriteBits(13, 0, storage_ix, storage);
literal_ratio = BuildAndStoreLiteralPrefixCode(
s, input, block_size, s->lit_depth, s->lit_bits, storage_ix, storage);
{
/* Store the pre-compressed command and distance prefix codes. */
size_t i;
for (i = 0; i + 7 < s->cmd_code_numbits; i += 8) {
BrotliWriteBits(8, s->cmd_code[i >> 3], storage_ix, storage);
}
}
BrotliWriteBits(s->cmd_code_numbits & 7,
s->cmd_code[s->cmd_code_numbits >> 3], storage_ix, storage);
emit_commands:
/* Initialize the command and distance histograms. We will gather
statistics of command and distance codes during the processing
of this block and use it to update the command and distance
prefix codes for the next block. */
memcpy(s->cmd_histo, kCmdHistoSeed, sizeof(kCmdHistoSeed));
/* "ip" is the input pointer. */
ip = input;
last_distance = -1;
ip_end = input + block_size;
if (BROTLI_PREDICT_TRUE(block_size >= kInputMarginBytes)) {
/* For the last block, we need to keep a 16 bytes margin so that we can be
sure that all distances are at most window size - 16.
For all other blocks, we only need to keep a margin of 5 bytes so that
we don't go over the block size with a copy. */
const size_t len_limit = BROTLI_MIN(size_t, block_size - kMinMatchLen,
input_size - kInputMarginBytes);
const uint8_t* ip_limit = input + len_limit;
uint32_t next_hash;
for (next_hash = Hash(++ip, shift); ; ) {
/* Step 1: Scan forward in the input looking for a 5-byte-long match.
If we get close to exhausting the input then goto emit_remainder.
Heuristic match skipping: If 32 bytes are scanned with no matches
found, start looking only at every other byte. If 32 more bytes are
scanned, look at every third byte, etc.. When a match is found,
immediately go back to looking at every byte. This is a small loss
(~5% performance, ~0.1% density) for compressible data due to more
bookkeeping, but for non-compressible data (such as JPEG) it's a huge
win since the compressor quickly "realizes" the data is incompressible
and doesn't bother looking for matches everywhere.
The "skip" variable keeps track of how many bytes there are since the
last match; dividing it by 32 (i.e. right-shifting by five) gives the
number of bytes to move ahead for each iteration. */
uint32_t skip = 32;
const uint8_t* next_ip = ip;
const uint8_t* candidate;
BROTLI_DCHECK(next_emit < ip);
trawl:
do {
uint32_t hash = next_hash;
uint32_t bytes_between_hash_lookups = skip++ >> 5;
BROTLI_DCHECK(hash == Hash(next_ip, shift));
ip = next_ip;
next_ip = ip + bytes_between_hash_lookups;
if (BROTLI_PREDICT_FALSE(next_ip > ip_limit)) {
goto emit_remainder;
}
next_hash = Hash(next_ip, shift);
candidate = ip - last_distance;
if (IsMatch(ip, candidate)) {
if (BROTLI_PREDICT_TRUE(candidate < ip)) {
table[hash] = (int)(ip - base_ip);
break;
}
}
candidate = base_ip + table[hash];
BROTLI_DCHECK(candidate >= base_ip);
BROTLI_DCHECK(candidate < ip);
table[hash] = (int)(ip - base_ip);
} while (BROTLI_PREDICT_TRUE(!IsMatch(ip, candidate)));
/* Check copy distance. If candidate is not feasible, continue search.
Checking is done outside of hot loop to reduce overhead. */
if (ip - candidate > MAX_DISTANCE) goto trawl;
/* Step 2: Emit the found match together with the literal bytes from
"next_emit" to the bit stream, and then see if we can find a next match
immediately afterwards. Repeat until we find no match for the input
without emitting some literal bytes. */
{
/* We have a 5-byte match at ip, and we need to emit bytes in
[next_emit, ip). */
const uint8_t* base = ip;
size_t matched = 5 + FindMatchLengthWithLimit(
candidate + 5, ip + 5, (size_t)(ip_end - ip) - 5);
int distance = (int)(base - candidate); /* > 0 */
size_t insert = (size_t)(base - next_emit);
ip += matched;
BROTLI_LOG(("[CompressFragment] pos = %d insert = %lu copy = %d\n",
(int)(next_emit - base_ip), (unsigned long)insert, 2));
BROTLI_DCHECK(0 == memcmp(base, candidate, matched));
if (BROTLI_PREDICT_TRUE(insert < 6210)) {
EmitInsertLen(insert, cmd_depth, cmd_bits, cmd_histo,
storage_ix, storage);
} else if (ShouldUseUncompressedMode(metablock_start, next_emit, insert,
literal_ratio)) {
EmitUncompressedMetaBlock(metablock_start, base, mlen_storage_ix - 3,
storage_ix, storage);
input_size -= (size_t)(base - input);
input = base;
next_emit = input;
goto next_block;
} else {
EmitLongInsertLen(insert, cmd_depth, cmd_bits, cmd_histo,
storage_ix, storage);
}
EmitLiterals(next_emit, insert, lit_depth, lit_bits,
storage_ix, storage);
if (distance == last_distance) {
BrotliWriteBits(cmd_depth[64], cmd_bits[64], storage_ix, storage);
++cmd_histo[64];
} else {
EmitDistance((size_t)distance, cmd_depth, cmd_bits,
cmd_histo, storage_ix, storage);
last_distance = distance;
}
EmitCopyLenLastDistance(matched, cmd_depth, cmd_bits, cmd_histo,
storage_ix, storage);
BROTLI_LOG(("[CompressFragment] pos = %d distance = %d\n"
"[CompressFragment] pos = %d insert = %d copy = %d\n"
"[CompressFragment] pos = %d distance = %d\n",
(int)(base - base_ip), (int)distance,
(int)(base - base_ip) + 2, 0, (int)matched - 2,
(int)(base - base_ip) + 2, (int)distance));
next_emit = ip;
if (BROTLI_PREDICT_FALSE(ip >= ip_limit)) {
goto emit_remainder;
}
/* We could immediately start working at ip now, but to improve
compression we first update "table" with the hashes of some positions
within the last copy. */
{
uint64_t input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 3);
uint32_t prev_hash = HashBytesAtOffset(input_bytes, 0, shift);
uint32_t cur_hash = HashBytesAtOffset(input_bytes, 3, shift);
table[prev_hash] = (int)(ip - base_ip - 3);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift);
table[prev_hash] = (int)(ip - base_ip - 2);
prev_hash = HashBytesAtOffset(input_bytes, 2, shift);
table[prev_hash] = (int)(ip - base_ip - 1);
candidate = base_ip + table[cur_hash];
table[cur_hash] = (int)(ip - base_ip);
}
}
while (IsMatch(ip, candidate)) {
/* We have a 5-byte match at ip, and no need to emit any literal bytes
prior to ip. */
const uint8_t* base = ip;
size_t matched = 5 + FindMatchLengthWithLimit(
candidate + 5, ip + 5, (size_t)(ip_end - ip) - 5);
if (ip - candidate > MAX_DISTANCE) break;
ip += matched;
last_distance = (int)(base - candidate); /* > 0 */
BROTLI_DCHECK(0 == memcmp(base, candidate, matched));
EmitCopyLen(matched, cmd_depth, cmd_bits, cmd_histo,
storage_ix, storage);
EmitDistance((size_t)last_distance, cmd_depth, cmd_bits,
cmd_histo, storage_ix, storage);
BROTLI_LOG(("[CompressFragment] pos = %d insert = %d copy = %d\n"
"[CompressFragment] pos = %d distance = %d\n",
(int)(base - base_ip), 0, (int)matched,
(int)(base - base_ip), (int)last_distance));
next_emit = ip;
if (BROTLI_PREDICT_FALSE(ip >= ip_limit)) {
goto emit_remainder;
}
/* We could immediately start working at ip now, but to improve
compression we first update "table" with the hashes of some positions
within the last copy. */
{
uint64_t input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 3);
uint32_t prev_hash = HashBytesAtOffset(input_bytes, 0, shift);
uint32_t cur_hash = HashBytesAtOffset(input_bytes, 3, shift);
table[prev_hash] = (int)(ip - base_ip - 3);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift);
table[prev_hash] = (int)(ip - base_ip - 2);
prev_hash = HashBytesAtOffset(input_bytes, 2, shift);
table[prev_hash] = (int)(ip - base_ip - 1);
candidate = base_ip + table[cur_hash];
table[cur_hash] = (int)(ip - base_ip);
}
}
next_hash = Hash(++ip, shift);
}
}
emit_remainder:
BROTLI_DCHECK(next_emit <= ip_end);
input += block_size;
input_size -= block_size;
block_size = BROTLI_MIN(size_t, input_size, kMergeBlockSize);
/* Decide if we want to continue this meta-block instead of emitting the
last insert-only command. */
if (input_size > 0 &&
total_block_size + block_size <= (1 << 20) &&
ShouldMergeBlock(s, input, block_size, lit_depth)) {
BROTLI_DCHECK(total_block_size > (1 << 16));
/* Update the size of the current meta-block and continue emitting commands.
We can do this because the current size and the new size both have 5
nibbles. */
total_block_size += block_size;
UpdateBits(20, (uint32_t)(total_block_size - 1), mlen_storage_ix, storage);
goto emit_commands;
}
/* Emit the remaining bytes as literals. */
if (next_emit < ip_end) {
const size_t insert = (size_t)(ip_end - next_emit);
BROTLI_LOG(("[CompressFragment] pos = %d insert = %lu copy = %d\n",
(int)(next_emit - base_ip), (unsigned long)insert, 2));
if (BROTLI_PREDICT_TRUE(insert < 6210)) {
EmitInsertLen(insert, cmd_depth, cmd_bits, cmd_histo,
storage_ix, storage);
EmitLiterals(next_emit, insert, lit_depth, lit_bits, storage_ix, storage);
} else if (ShouldUseUncompressedMode(metablock_start, next_emit, insert,
literal_ratio)) {
EmitUncompressedMetaBlock(metablock_start, ip_end, mlen_storage_ix - 3,
storage_ix, storage);
} else {
EmitLongInsertLen(insert, cmd_depth, cmd_bits, cmd_histo,
storage_ix, storage);
EmitLiterals(next_emit, insert, lit_depth, lit_bits,
storage_ix, storage);
}
}
next_emit = ip_end;
next_block:
/* If we have more data, write a new meta-block header and prefix codes and
then continue emitting commands. */
if (input_size > 0) {
metablock_start = input;
block_size = BROTLI_MIN(size_t, input_size, kFirstBlockSize);
total_block_size = block_size;
/* Save the bit position of the MLEN field of the meta-block header, so that
we can update it later if we decide to extend this meta-block. */
mlen_storage_ix = *storage_ix + 3;
BrotliStoreMetaBlockHeader(block_size, 0, storage_ix, storage);
/* No block splits, no contexts. */
BrotliWriteBits(13, 0, storage_ix, storage);
literal_ratio = BuildAndStoreLiteralPrefixCode(
s, input, block_size, lit_depth, lit_bits, storage_ix, storage);
BuildAndStoreCommandPrefixCode(s, storage_ix, storage);
goto emit_commands;
}
if (!is_last) {
/* If this is not the last block, update the command and distance prefix
codes for the next block and store the compressed forms. */
s->cmd_code[0] = 0;
s->cmd_code_numbits = 0;
BuildAndStoreCommandPrefixCode(s, &s->cmd_code_numbits, s->cmd_code);
}
}
#define FOR_TABLE_BITS_(X) X(9) X(11) X(13) X(15)
#define BAKE_METHOD_PARAM_(B) \
static BROTLI_NOINLINE void BrotliCompressFragmentFastImpl ## B( \
BrotliOnePassArena* s, const uint8_t* input, size_t input_size, \
BROTLI_BOOL is_last, int* table, size_t* storage_ix, uint8_t* storage) { \
BrotliCompressFragmentFastImpl(s, input, input_size, is_last, table, B, \
storage_ix, storage); \
}
FOR_TABLE_BITS_(BAKE_METHOD_PARAM_)
#undef BAKE_METHOD_PARAM_
void BrotliCompressFragmentFast(
BrotliOnePassArena* s, const uint8_t* input, size_t input_size,
BROTLI_BOOL is_last, int* table, size_t table_size,
size_t* storage_ix, uint8_t* storage) {
const size_t initial_storage_ix = *storage_ix;
const size_t table_bits = Log2FloorNonZero(table_size);
if (input_size == 0) {
BROTLI_DCHECK(is_last);
BrotliWriteBits(1, 1, storage_ix, storage); /* islast */
BrotliWriteBits(1, 1, storage_ix, storage); /* isempty */
*storage_ix = (*storage_ix + 7u) & ~7u;
return;
}
switch (table_bits) {
#define CASE_(B) \
case B: \
BrotliCompressFragmentFastImpl ## B( \
s, input, input_size, is_last, table, storage_ix, storage);\
break;
FOR_TABLE_BITS_(CASE_)
#undef CASE_
default: BROTLI_DCHECK(0); break;
}
/* If output is larger than single uncompressed block, rewrite it. */
if (*storage_ix - initial_storage_ix > 31 + (input_size << 3)) {
EmitUncompressedMetaBlock(input, input + input_size, initial_storage_ix,
storage_ix, storage);
}
if (is_last) {
BrotliWriteBits(1, 1, storage_ix, storage); /* islast */
BrotliWriteBits(1, 1, storage_ix, storage); /* isempty */
*storage_ix = (*storage_ix + 7u) & ~7u;
}
}
#undef FOR_TABLE_BITS_
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

84
c/enc/compress_fragment.h Normal file
View File

@@ -0,0 +1,84 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function for fast encoding of an input fragment, independently from the input
history. This function uses one-pass processing: when we find a backward
match, we immediately emit the corresponding command and literal codes to
the bit stream. */
#ifndef BROTLI_ENC_COMPRESS_FRAGMENT_H_
#define BROTLI_ENC_COMPRESS_FRAGMENT_H_
#include "../common/constants.h"
#include "../common/platform.h"
#include "entropy_encode.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
typedef struct BrotliOnePassArena {
uint8_t lit_depth[256];
uint16_t lit_bits[256];
/* Command and distance prefix codes (each 64 symbols, stored back-to-back)
used for the next block. The command prefix code is over a smaller alphabet
with the following 64 symbols:
0 - 15: insert length code 0, copy length code 0 - 15, same distance
16 - 39: insert length code 0, copy length code 0 - 23
40 - 63: insert length code 0 - 23, copy length code 0
Note that symbols 16 and 40 represent the same code in the full alphabet,
but we do not use either of them. */
uint8_t cmd_depth[128];
uint16_t cmd_bits[128];
uint32_t cmd_histo[128];
/* The compressed form of the command and distance prefix codes for the next
block. */
uint8_t cmd_code[512];
size_t cmd_code_numbits;
HuffmanTree tree[2 * BROTLI_NUM_LITERAL_SYMBOLS + 1];
uint32_t histogram[256];
uint8_t tmp_depth[BROTLI_NUM_COMMAND_SYMBOLS];
uint16_t tmp_bits[64];
} BrotliOnePassArena;
/* Compresses "input" string to the "*storage" buffer as one or more complete
meta-blocks, and updates the "*storage_ix" bit position.
If "is_last" is 1, emits an additional empty last meta-block.
"cmd_depth" and "cmd_bits" contain the command and distance prefix codes
(see comment in encode.h) used for the encoding of this input fragment.
If "is_last" is 0, they are updated to reflect the statistics
of this input fragment, to be used for the encoding of the next fragment.
"*cmd_code_numbits" is the number of bits of the compressed representation
of the command and distance prefix codes, and "cmd_code" is an array of
at least "(*cmd_code_numbits + 7) >> 3" size that contains the compressed
command and distance prefix codes. If "is_last" is 0, these are also
updated to represent the updated "cmd_depth" and "cmd_bits".
REQUIRES: "input_size" is greater than zero, or "is_last" is 1.
REQUIRES: "input_size" is less or equal to maximal metablock size (1 << 24).
REQUIRES: All elements in "table[0..table_size-1]" are initialized to zero.
REQUIRES: "table_size" is an odd (9, 11, 13, 15) power of two
OUTPUT: maximal copy distance <= |input_size|
OUTPUT: maximal copy distance <= BROTLI_MAX_BACKWARD_LIMIT(18) */
BROTLI_INTERNAL void BrotliCompressFragmentFast(BrotliOnePassArena* s,
const uint8_t* input,
size_t input_size,
BROTLI_BOOL is_last,
int* table, size_t table_size,
size_t* storage_ix,
uint8_t* storage);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_COMPRESS_FRAGMENT_H_ */

View File

@@ -0,0 +1,647 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function for fast encoding of an input fragment, independently from the input
history. This function uses two-pass processing: in the first pass we save
the found backward matches and literal bytes into a buffer, and in the
second pass we emit them into the bit stream using prefix codes built based
on the actual command and literal byte histograms. */
#include "compress_fragment_two_pass.h"
#include "../common/constants.h"
#include "../common/platform.h"
#include "bit_cost.h"
#include "brotli_bit_stream.h"
#include "entropy_encode.h"
#include "fast_log.h"
#include "find_match_length.h"
#include "hash_base.h"
#include "write_bits.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#define MAX_DISTANCE (long)BROTLI_MAX_BACKWARD_LIMIT(18)
static BROTLI_INLINE uint32_t Hash(const uint8_t* p,
size_t shift, size_t length) {
const uint64_t h =
(BROTLI_UNALIGNED_LOAD64LE(p) << ((8 - length) * 8)) * kHashMul32;
return (uint32_t)(h >> shift);
}
static BROTLI_INLINE uint32_t HashBytesAtOffset(uint64_t v, size_t offset,
size_t shift, size_t length) {
BROTLI_DCHECK(offset <= 8 - length);
{
const uint64_t h = ((v >> (8 * offset)) << ((8 - length) * 8)) * kHashMul32;
return (uint32_t)(h >> shift);
}
}
static BROTLI_INLINE BROTLI_BOOL IsMatch(const uint8_t* p1, const uint8_t* p2,
size_t length) {
if (BrotliUnalignedRead32(p1) == BrotliUnalignedRead32(p2)) {
if (length == 4) return BROTLI_TRUE;
return TO_BROTLI_BOOL(p1[4] == p2[4] && p1[5] == p2[5]);
}
return BROTLI_FALSE;
}
/* Builds a command and distance prefix code (each 64 symbols) into "depth" and
"bits" based on "histogram" and stores it into the bit stream. */
static void BuildAndStoreCommandPrefixCode(BrotliTwoPassArena* s,
size_t* storage_ix,
uint8_t* storage) {
/* Tree size for building a tree over 64 symbols is 2 * 64 + 1. */
/* TODO(eustas): initialize once. */
memset(s->tmp_depth, 0, sizeof(s->tmp_depth));
BrotliCreateHuffmanTree(s->cmd_histo, 64, 15, s->tmp_tree, s->cmd_depth);
BrotliCreateHuffmanTree(&s->cmd_histo[64], 64, 14, s->tmp_tree,
&s->cmd_depth[64]);
/* We have to jump through a few hoops here in order to compute
the command bits because the symbols are in a different order than in
the full alphabet. This looks complicated, but having the symbols
in this order in the command bits saves a few branches in the Emit*
functions. */
memcpy(s->tmp_depth, s->cmd_depth + 24, 24);
memcpy(s->tmp_depth + 24, s->cmd_depth, 8);
memcpy(s->tmp_depth + 32, s->cmd_depth + 48, 8);
memcpy(s->tmp_depth + 40, s->cmd_depth + 8, 8);
memcpy(s->tmp_depth + 48, s->cmd_depth + 56, 8);
memcpy(s->tmp_depth + 56, s->cmd_depth + 16, 8);
BrotliConvertBitDepthsToSymbols(s->tmp_depth, 64, s->tmp_bits);
memcpy(s->cmd_bits, s->tmp_bits + 24, 16);
memcpy(s->cmd_bits + 8, s->tmp_bits + 40, 16);
memcpy(s->cmd_bits + 16, s->tmp_bits + 56, 16);
memcpy(s->cmd_bits + 24, s->tmp_bits, 48);
memcpy(s->cmd_bits + 48, s->tmp_bits + 32, 16);
memcpy(s->cmd_bits + 56, s->tmp_bits + 48, 16);
BrotliConvertBitDepthsToSymbols(&s->cmd_depth[64], 64, &s->cmd_bits[64]);
{
/* Create the bit length array for the full command alphabet. */
size_t i;
memset(s->tmp_depth, 0, 64); /* only 64 first values were used */
memcpy(s->tmp_depth, s->cmd_depth + 24, 8);
memcpy(s->tmp_depth + 64, s->cmd_depth + 32, 8);
memcpy(s->tmp_depth + 128, s->cmd_depth + 40, 8);
memcpy(s->tmp_depth + 192, s->cmd_depth + 48, 8);
memcpy(s->tmp_depth + 384, s->cmd_depth + 56, 8);
for (i = 0; i < 8; ++i) {
s->tmp_depth[128 + 8 * i] = s->cmd_depth[i];
s->tmp_depth[256 + 8 * i] = s->cmd_depth[8 + i];
s->tmp_depth[448 + 8 * i] = s->cmd_depth[16 + i];
}
BrotliStoreHuffmanTree(s->tmp_depth, BROTLI_NUM_COMMAND_SYMBOLS,
s->tmp_tree, storage_ix, storage);
}
BrotliStoreHuffmanTree(&s->cmd_depth[64], 64, s->tmp_tree, storage_ix,
storage);
}
static BROTLI_INLINE void EmitInsertLen(
uint32_t insertlen, uint32_t** commands) {
if (insertlen < 6) {
**commands = insertlen;
} else if (insertlen < 130) {
const uint32_t tail = insertlen - 2;
const uint32_t nbits = Log2FloorNonZero(tail) - 1u;
const uint32_t prefix = tail >> nbits;
const uint32_t inscode = (nbits << 1) + prefix + 2;
const uint32_t extra = tail - (prefix << nbits);
**commands = inscode | (extra << 8);
} else if (insertlen < 2114) {
const uint32_t tail = insertlen - 66;
const uint32_t nbits = Log2FloorNonZero(tail);
const uint32_t code = nbits + 10;
const uint32_t extra = tail - (1u << nbits);
**commands = code | (extra << 8);
} else if (insertlen < 6210) {
const uint32_t extra = insertlen - 2114;
**commands = 21 | (extra << 8);
} else if (insertlen < 22594) {
const uint32_t extra = insertlen - 6210;
**commands = 22 | (extra << 8);
} else {
const uint32_t extra = insertlen - 22594;
**commands = 23 | (extra << 8);
}
++(*commands);
}
static BROTLI_INLINE void EmitCopyLen(size_t copylen, uint32_t** commands) {
if (copylen < 10) {
**commands = (uint32_t)(copylen + 38);
} else if (copylen < 134) {
const size_t tail = copylen - 6;
const size_t nbits = Log2FloorNonZero(tail) - 1;
const size_t prefix = tail >> nbits;
const size_t code = (nbits << 1) + prefix + 44;
const size_t extra = tail - (prefix << nbits);
**commands = (uint32_t)(code | (extra << 8));
} else if (copylen < 2118) {
const size_t tail = copylen - 70;
const size_t nbits = Log2FloorNonZero(tail);
const size_t code = nbits + 52;
const size_t extra = tail - ((size_t)1 << nbits);
**commands = (uint32_t)(code | (extra << 8));
} else {
const size_t extra = copylen - 2118;
**commands = (uint32_t)(63 | (extra << 8));
}
++(*commands);
}
static BROTLI_INLINE void EmitCopyLenLastDistance(
size_t copylen, uint32_t** commands) {
if (copylen < 12) {
**commands = (uint32_t)(copylen + 20);
++(*commands);
} else if (copylen < 72) {
const size_t tail = copylen - 8;
const size_t nbits = Log2FloorNonZero(tail) - 1;
const size_t prefix = tail >> nbits;
const size_t code = (nbits << 1) + prefix + 28;
const size_t extra = tail - (prefix << nbits);
**commands = (uint32_t)(code | (extra << 8));
++(*commands);
} else if (copylen < 136) {
const size_t tail = copylen - 8;
const size_t code = (tail >> 5) + 54;
const size_t extra = tail & 31;
**commands = (uint32_t)(code | (extra << 8));
++(*commands);
**commands = 64;
++(*commands);
} else if (copylen < 2120) {
const size_t tail = copylen - 72;
const size_t nbits = Log2FloorNonZero(tail);
const size_t code = nbits + 52;
const size_t extra = tail - ((size_t)1 << nbits);
**commands = (uint32_t)(code | (extra << 8));
++(*commands);
**commands = 64;
++(*commands);
} else {
const size_t extra = copylen - 2120;
**commands = (uint32_t)(63 | (extra << 8));
++(*commands);
**commands = 64;
++(*commands);
}
}
static BROTLI_INLINE void EmitDistance(uint32_t distance, uint32_t** commands) {
uint32_t d = distance + 3;
uint32_t nbits = Log2FloorNonZero(d) - 1;
const uint32_t prefix = (d >> nbits) & 1;
const uint32_t offset = (2 + prefix) << nbits;
const uint32_t distcode = 2 * (nbits - 1) + prefix + 80;
uint32_t extra = d - offset;
**commands = distcode | (extra << 8);
++(*commands);
}
/* REQUIRES: len <= 1 << 24. */
static void BrotliStoreMetaBlockHeader(
size_t len, BROTLI_BOOL is_uncompressed, size_t* storage_ix,
uint8_t* storage) {
size_t nibbles = 6;
/* ISLAST */
BrotliWriteBits(1, 0, storage_ix, storage);
if (len <= (1U << 16)) {
nibbles = 4;
} else if (len <= (1U << 20)) {
nibbles = 5;
}
BrotliWriteBits(2, nibbles - 4, storage_ix, storage);
BrotliWriteBits(nibbles * 4, len - 1, storage_ix, storage);
/* ISUNCOMPRESSED */
BrotliWriteBits(1, (uint64_t)is_uncompressed, storage_ix, storage);
}
static BROTLI_INLINE void CreateCommands(const uint8_t* input,
size_t block_size, size_t input_size, const uint8_t* base_ip, int* table,
size_t table_bits, size_t min_match,
uint8_t** literals, uint32_t** commands) {
/* "ip" is the input pointer. */
const uint8_t* ip = input;
const size_t shift = 64u - table_bits;
const uint8_t* ip_end = input + block_size;
/* "next_emit" is a pointer to the first byte that is not covered by a
previous copy. Bytes between "next_emit" and the start of the next copy or
the end of the input will be emitted as literal bytes. */
const uint8_t* next_emit = input;
int last_distance = -1;
const size_t kInputMarginBytes = BROTLI_WINDOW_GAP;
if (BROTLI_PREDICT_TRUE(block_size >= kInputMarginBytes)) {
/* For the last block, we need to keep a 16 bytes margin so that we can be
sure that all distances are at most window size - 16.
For all other blocks, we only need to keep a margin of 5 bytes so that
we don't go over the block size with a copy. */
const size_t len_limit = BROTLI_MIN(size_t, block_size - min_match,
input_size - kInputMarginBytes);
const uint8_t* ip_limit = input + len_limit;
uint32_t next_hash;
for (next_hash = Hash(++ip, shift, min_match); ; ) {
/* Step 1: Scan forward in the input looking for a 6-byte-long match.
If we get close to exhausting the input then goto emit_remainder.
Heuristic match skipping: If 32 bytes are scanned with no matches
found, start looking only at every other byte. If 32 more bytes are
scanned, look at every third byte, etc.. When a match is found,
immediately go back to looking at every byte. This is a small loss
(~5% performance, ~0.1% density) for compressible data due to more
bookkeeping, but for non-compressible data (such as JPEG) it's a huge
win since the compressor quickly "realizes" the data is incompressible
and doesn't bother looking for matches everywhere.
The "skip" variable keeps track of how many bytes there are since the
last match; dividing it by 32 (ie. right-shifting by five) gives the
number of bytes to move ahead for each iteration. */
uint32_t skip = 32;
const uint8_t* next_ip = ip;
const uint8_t* candidate;
BROTLI_DCHECK(next_emit < ip);
trawl:
do {
uint32_t hash = next_hash;
uint32_t bytes_between_hash_lookups = skip++ >> 5;
ip = next_ip;
BROTLI_DCHECK(hash == Hash(ip, shift, min_match));
next_ip = ip + bytes_between_hash_lookups;
if (BROTLI_PREDICT_FALSE(next_ip > ip_limit)) {
goto emit_remainder;
}
next_hash = Hash(next_ip, shift, min_match);
candidate = ip - last_distance;
if (IsMatch(ip, candidate, min_match)) {
if (BROTLI_PREDICT_TRUE(candidate < ip)) {
table[hash] = (int)(ip - base_ip);
break;
}
}
candidate = base_ip + table[hash];
BROTLI_DCHECK(candidate >= base_ip);
BROTLI_DCHECK(candidate < ip);
table[hash] = (int)(ip - base_ip);
} while (BROTLI_PREDICT_TRUE(!IsMatch(ip, candidate, min_match)));
/* Check copy distance. If candidate is not feasible, continue search.
Checking is done outside of hot loop to reduce overhead. */
if (ip - candidate > MAX_DISTANCE) goto trawl;
/* Step 2: Emit the found match together with the literal bytes from
"next_emit", and then see if we can find a next match immediately
afterwards. Repeat until we find no match for the input
without emitting some literal bytes. */
{
/* We have a 6-byte match at ip, and we need to emit bytes in
[next_emit, ip). */
const uint8_t* base = ip;
size_t matched = min_match + FindMatchLengthWithLimit(
candidate + min_match, ip + min_match,
(size_t)(ip_end - ip) - min_match);
int distance = (int)(base - candidate); /* > 0 */
int insert = (int)(base - next_emit);
ip += matched;
BROTLI_DCHECK(0 == memcmp(base, candidate, matched));
EmitInsertLen((uint32_t)insert, commands);
BROTLI_LOG(("[CompressFragment] pos = %d insert = %d copy = %d\n",
(int)(next_emit - base_ip), insert, 2));
memcpy(*literals, next_emit, (size_t)insert);
*literals += insert;
if (distance == last_distance) {
**commands = 64;
++(*commands);
} else {
EmitDistance((uint32_t)distance, commands);
last_distance = distance;
}
EmitCopyLenLastDistance(matched, commands);
BROTLI_LOG(("[CompressFragment] pos = %d distance = %d\n"
"[CompressFragment] pos = %d insert = %d copy = %d\n"
"[CompressFragment] pos = %d distance = %d\n",
(int)(base - base_ip), (int)distance,
(int)(base - base_ip) + 2, 0, (int)matched - 2,
(int)(base - base_ip) + 2, (int)distance));
next_emit = ip;
if (BROTLI_PREDICT_FALSE(ip >= ip_limit)) {
goto emit_remainder;
}
{
/* We could immediately start working at ip now, but to improve
compression we first update "table" with the hashes of some
positions within the last copy. */
uint64_t input_bytes;
uint32_t cur_hash;
uint32_t prev_hash;
if (min_match == 4) {
input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 3);
cur_hash = HashBytesAtOffset(input_bytes, 3, shift, min_match);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 3);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 2);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 1);
} else {
input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 5);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 5);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 4);
prev_hash = HashBytesAtOffset(input_bytes, 2, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 3);
input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 2);
cur_hash = HashBytesAtOffset(input_bytes, 2, shift, min_match);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 2);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 1);
}
candidate = base_ip + table[cur_hash];
table[cur_hash] = (int)(ip - base_ip);
}
}
while (ip - candidate <= MAX_DISTANCE &&
IsMatch(ip, candidate, min_match)) {
/* We have a 6-byte match at ip, and no need to emit any
literal bytes prior to ip. */
const uint8_t* base = ip;
size_t matched = min_match + FindMatchLengthWithLimit(
candidate + min_match, ip + min_match,
(size_t)(ip_end - ip) - min_match);
ip += matched;
last_distance = (int)(base - candidate); /* > 0 */
BROTLI_DCHECK(0 == memcmp(base, candidate, matched));
EmitCopyLen(matched, commands);
EmitDistance((uint32_t)last_distance, commands);
BROTLI_LOG(("[CompressFragment] pos = %d insert = %d copy = %d\n"
"[CompressFragment] pos = %d distance = %d\n",
(int)(base - base_ip), 0, (int)matched,
(int)(base - base_ip), (int)last_distance));
next_emit = ip;
if (BROTLI_PREDICT_FALSE(ip >= ip_limit)) {
goto emit_remainder;
}
{
/* We could immediately start working at ip now, but to improve
compression we first update "table" with the hashes of some
positions within the last copy. */
uint64_t input_bytes;
uint32_t cur_hash;
uint32_t prev_hash;
if (min_match == 4) {
input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 3);
cur_hash = HashBytesAtOffset(input_bytes, 3, shift, min_match);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 3);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 2);
prev_hash = HashBytesAtOffset(input_bytes, 2, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 1);
} else {
input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 5);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 5);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 4);
prev_hash = HashBytesAtOffset(input_bytes, 2, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 3);
input_bytes = BROTLI_UNALIGNED_LOAD64LE(ip - 2);
cur_hash = HashBytesAtOffset(input_bytes, 2, shift, min_match);
prev_hash = HashBytesAtOffset(input_bytes, 0, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 2);
prev_hash = HashBytesAtOffset(input_bytes, 1, shift, min_match);
table[prev_hash] = (int)(ip - base_ip - 1);
}
candidate = base_ip + table[cur_hash];
table[cur_hash] = (int)(ip - base_ip);
}
}
next_hash = Hash(++ip, shift, min_match);
}
}
emit_remainder:
BROTLI_DCHECK(next_emit <= ip_end);
/* Emit the remaining bytes as literals. */
if (next_emit < ip_end) {
const uint32_t insert = (uint32_t)(ip_end - next_emit);
EmitInsertLen(insert, commands);
BROTLI_LOG(("[CompressFragment] pos = %d insert = %d copy = %d\n",
(int)(next_emit - base_ip), insert, 2));
memcpy(*literals, next_emit, insert);
*literals += insert;
}
}
static void StoreCommands(BrotliTwoPassArena* s,
const uint8_t* literals, const size_t num_literals,
const uint32_t* commands, const size_t num_commands,
size_t* storage_ix, uint8_t* storage) {
static const BROTLI_MODEL("small") uint32_t kNumExtraBits[128] = {
0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5,
6, 7, 8, 9, 10, 12, 14, 24, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 2, 2, 3, 3, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 9, 10, 24,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8,
9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16,
17, 17, 18, 18, 19, 19, 20, 20, 21, 21, 22, 22, 23, 23, 24, 24,
};
static const BROTLI_MODEL("small") uint32_t kInsertOffset[24] = {
0, 1, 2, 3, 4, 5, 6, 8, 10, 14, 18, 26,
34, 50, 66, 98, 130, 194, 322, 578, 1090, 2114, 6210, 22594,
};
size_t i;
memset(s->lit_histo, 0, sizeof(s->lit_histo));
/* TODO(eustas): is that necessary? */
memset(s->cmd_depth, 0, sizeof(s->cmd_depth));
/* TODO(eustas): is that necessary? */
memset(s->cmd_bits, 0, sizeof(s->cmd_bits));
memset(s->cmd_histo, 0, sizeof(s->cmd_histo));
for (i = 0; i < num_literals; ++i) {
++s->lit_histo[literals[i]];
}
BrotliBuildAndStoreHuffmanTreeFast(s->tmp_tree, s->lit_histo, num_literals,
/* max_bits = */ 8, s->lit_depth,
s->lit_bits, storage_ix, storage);
for (i = 0; i < num_commands; ++i) {
const uint32_t code = commands[i] & 0xFF;
BROTLI_DCHECK(code < 128);
++s->cmd_histo[code];
}
s->cmd_histo[1] += 1;
s->cmd_histo[2] += 1;
s->cmd_histo[64] += 1;
s->cmd_histo[84] += 1;
BuildAndStoreCommandPrefixCode(s, storage_ix, storage);
for (i = 0; i < num_commands; ++i) {
const uint32_t cmd = commands[i];
const uint32_t code = cmd & 0xFF;
const uint32_t extra = cmd >> 8;
BROTLI_DCHECK(code < 128);
BrotliWriteBits(s->cmd_depth[code], s->cmd_bits[code], storage_ix, storage);
BrotliWriteBits(kNumExtraBits[code], extra, storage_ix, storage);
if (code < 24) {
const uint32_t insert = kInsertOffset[code] + extra;
uint32_t j;
for (j = 0; j < insert; ++j) {
const uint8_t lit = *literals;
BrotliWriteBits(s->lit_depth[lit], s->lit_bits[lit], storage_ix,
storage);
++literals;
}
}
}
}
/* Acceptable loss for uncompressible speedup is 2% */
#define MIN_RATIO 0.98
#define SAMPLE_RATE 43
static BROTLI_BOOL ShouldCompress(BrotliTwoPassArena* s,
const uint8_t* input, size_t input_size, size_t num_literals) {
double corpus_size = (double)input_size;
if ((double)num_literals < MIN_RATIO * corpus_size) {
return BROTLI_TRUE;
} else {
const double max_total_bit_cost = corpus_size * 8 * MIN_RATIO / SAMPLE_RATE;
size_t i;
memset(s->lit_histo, 0, sizeof(s->lit_histo));
for (i = 0; i < input_size; i += SAMPLE_RATE) {
++s->lit_histo[input[i]];
}
return TO_BROTLI_BOOL(
BrotliBitsEntropy(s->lit_histo, 256) < max_total_bit_cost);
}
}
static void RewindBitPosition(const size_t new_storage_ix,
size_t* storage_ix, uint8_t* storage) {
const size_t bitpos = new_storage_ix & 7;
const size_t mask = (1u << bitpos) - 1;
storage[new_storage_ix >> 3] &= (uint8_t)mask;
*storage_ix = new_storage_ix;
}
static void EmitUncompressedMetaBlock(const uint8_t* input, size_t input_size,
size_t* storage_ix, uint8_t* storage) {
BrotliStoreMetaBlockHeader(input_size, 1, storage_ix, storage);
*storage_ix = (*storage_ix + 7u) & ~7u;
memcpy(&storage[*storage_ix >> 3], input, input_size);
*storage_ix += input_size << 3;
storage[*storage_ix >> 3] = 0;
}
static BROTLI_INLINE void BrotliCompressFragmentTwoPassImpl(
BrotliTwoPassArena* s, const uint8_t* input, size_t input_size,
BROTLI_BOOL is_last, uint32_t* command_buf, uint8_t* literal_buf,
int* table, size_t table_bits, size_t min_match,
size_t* storage_ix, uint8_t* storage) {
/* Save the start of the first block for position and distance computations.
*/
const uint8_t* base_ip = input;
BROTLI_UNUSED(is_last);
while (input_size > 0) {
size_t block_size =
BROTLI_MIN(size_t, input_size, kCompressFragmentTwoPassBlockSize);
uint32_t* commands = command_buf;
uint8_t* literals = literal_buf;
size_t num_literals;
CreateCommands(input, block_size, input_size, base_ip, table,
table_bits, min_match, &literals, &commands);
num_literals = (size_t)(literals - literal_buf);
if (ShouldCompress(s, input, block_size, num_literals)) {
const size_t num_commands = (size_t)(commands - command_buf);
BrotliStoreMetaBlockHeader(block_size, 0, storage_ix, storage);
/* No block splits, no contexts. */
BrotliWriteBits(13, 0, storage_ix, storage);
StoreCommands(s, literal_buf, num_literals, command_buf, num_commands,
storage_ix, storage);
} else {
/* Since we did not find many backward references and the entropy of
the data is close to 8 bits, we can simply emit an uncompressed block.
This makes compression speed of uncompressible data about 3x faster. */
EmitUncompressedMetaBlock(input, block_size, storage_ix, storage);
}
input += block_size;
input_size -= block_size;
}
}
#define FOR_TABLE_BITS_(X) \
X(8) X(9) X(10) X(11) X(12) X(13) X(14) X(15) X(16) X(17)
#define BAKE_METHOD_PARAM_(B) \
static BROTLI_NOINLINE void BrotliCompressFragmentTwoPassImpl ## B( \
BrotliTwoPassArena* s, const uint8_t* input, size_t input_size, \
BROTLI_BOOL is_last, uint32_t* command_buf, uint8_t* literal_buf, \
int* table, size_t* storage_ix, uint8_t* storage) { \
size_t min_match = (B <= 15) ? 4 : 6; \
BrotliCompressFragmentTwoPassImpl(s, input, input_size, is_last, command_buf,\
literal_buf, table, B, min_match, storage_ix, storage); \
}
FOR_TABLE_BITS_(BAKE_METHOD_PARAM_)
#undef BAKE_METHOD_PARAM_
void BrotliCompressFragmentTwoPass(
BrotliTwoPassArena* s, const uint8_t* input, size_t input_size,
BROTLI_BOOL is_last, uint32_t* command_buf, uint8_t* literal_buf,
int* table, size_t table_size, size_t* storage_ix, uint8_t* storage) {
const size_t initial_storage_ix = *storage_ix;
const size_t table_bits = Log2FloorNonZero(table_size);
switch (table_bits) {
#define CASE_(B) \
case B: \
BrotliCompressFragmentTwoPassImpl ## B( \
s, input, input_size, is_last, command_buf, \
literal_buf, table, storage_ix, storage); \
break;
FOR_TABLE_BITS_(CASE_)
#undef CASE_
default: BROTLI_DCHECK(0); break;
}
/* If output is larger than single uncompressed block, rewrite it. */
if (*storage_ix - initial_storage_ix > 31 + (input_size << 3)) {
RewindBitPosition(initial_storage_ix, storage_ix, storage);
EmitUncompressedMetaBlock(input, input_size, storage_ix, storage);
}
if (is_last) {
BrotliWriteBits(1, 1, storage_ix, storage); /* islast */
BrotliWriteBits(1, 1, storage_ix, storage); /* isempty */
*storage_ix = (*storage_ix + 7u) & ~7u;
}
}
#undef FOR_TABLE_BITS_
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

View File

@@ -0,0 +1,70 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function for fast encoding of an input fragment, independently from the input
history. This function uses two-pass processing: in the first pass we save
the found backward matches and literal bytes into a buffer, and in the
second pass we emit them into the bit stream using prefix codes built based
on the actual command and literal byte histograms. */
#ifndef BROTLI_ENC_COMPRESS_FRAGMENT_TWO_PASS_H_
#define BROTLI_ENC_COMPRESS_FRAGMENT_TWO_PASS_H_
#include "../common/constants.h"
#include "../common/platform.h"
#include "entropy_encode.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* TODO(eustas): turn to macro. */
static const size_t kCompressFragmentTwoPassBlockSize = 1 << 17;
typedef struct BrotliTwoPassArena {
uint32_t lit_histo[256];
uint8_t lit_depth[256];
uint16_t lit_bits[256];
uint32_t cmd_histo[128];
uint8_t cmd_depth[128];
uint16_t cmd_bits[128];
/* BuildAndStoreCommandPrefixCode */
HuffmanTree tmp_tree[2 * BROTLI_NUM_LITERAL_SYMBOLS + 1];
uint8_t tmp_depth[BROTLI_NUM_COMMAND_SYMBOLS];
uint16_t tmp_bits[64];
} BrotliTwoPassArena;
/* Compresses "input" string to the "*storage" buffer as one or more complete
meta-blocks, and updates the "*storage_ix" bit position.
If "is_last" is 1, emits an additional empty last meta-block.
REQUIRES: "input_size" is greater than zero, or "is_last" is 1.
REQUIRES: "input_size" is less or equal to maximal metablock size (1 << 24).
REQUIRES: "command_buf" and "literal_buf" point to at least
kCompressFragmentTwoPassBlockSize long arrays.
REQUIRES: All elements in "table[0..table_size-1]" are initialized to zero.
REQUIRES: "table_size" is a power of two
OUTPUT: maximal copy distance <= |input_size|
OUTPUT: maximal copy distance <= BROTLI_MAX_BACKWARD_LIMIT(18) */
BROTLI_INTERNAL void BrotliCompressFragmentTwoPass(BrotliTwoPassArena* s,
const uint8_t* input,
size_t input_size,
BROTLI_BOOL is_last,
uint32_t* command_buf,
uint8_t* literal_buf,
int* table,
size_t table_size,
size_t* storage_ix,
uint8_t* storage);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_COMPRESS_FRAGMENT_TWO_PASS_H_ */

145
c/enc/dictionary_hash.c Normal file
View File

@@ -0,0 +1,145 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Hash table on the 4-byte prefixes of static dictionary words. */
#include "dictionary_hash.h"
#include "../common/platform.h" /* IWYU pragma: keep */
#include "../common/static_init.h"
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
#include "../common/dictionary.h"
#include "hash_base.h"
#endif
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
BROTLI_BOOL BROTLI_COLD BrotliEncoderInitDictionaryHash(
const BrotliDictionary* dict, uint16_t* words, uint8_t* lengths) {
size_t global_idx = 0;
size_t len;
size_t i;
static const uint8_t frozen_idx[1688] = {0, 0, 8, 164, 32, 56, 31, 191, 36, 4,
128, 81, 68, 132, 145, 129, 0, 0, 0, 28, 0, 8, 1, 1, 64, 3, 1, 0, 0, 0, 0, 0, 4,
64, 1, 2, 128, 0, 132, 49, 0, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0, 1, 0, 36, 152,
0, 0, 0, 0, 128, 8, 0, 0, 128, 0, 0, 8, 0, 0, 64, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8,
0, 0, 0, 1, 0, 64, 133, 0, 32, 0, 0, 128, 1, 0, 0, 0, 0, 4, 4, 4, 32, 16, 130,
0, 128, 8, 0, 0, 0, 0, 0, 64, 0, 64, 0, 160, 0, 148, 53, 0, 0, 0, 0, 0, 128, 0,
130, 0, 0, 0, 8, 0, 0, 0, 0, 0, 48, 0, 0, 0, 0, 0, 0, 32, 1, 32, 129, 0, 12, 0,
1, 0, 0, 0, 0, 0, 0, 0, 16, 0, 0, 0, 16, 32, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 8,
0, 0, 2, 0, 0, 0, 0, 0, 32, 0, 0, 0, 2, 66, 128, 0, 0, 16, 0, 0, 0, 0, 64, 1, 6,
128, 8, 0, 192, 24, 32, 0, 0, 8, 4, 128, 128, 2, 160, 0, 160, 0, 64, 0, 0, 2, 0,
0, 0, 0, 0, 0, 0, 0, 0, 32, 1, 0, 0, 64, 0, 0, 0, 0, 0, 0, 32, 0, 66, 0, 2, 0,
4, 0, 8, 0, 2, 0, 0, 33, 8, 0, 0, 0, 8, 0, 128, 162, 4, 128, 0, 2, 33, 0, 160,
0, 8, 0, 64, 0, 160, 0, 129, 4, 0, 0, 32, 0, 0, 32, 0, 2, 0, 0, 0, 0, 0, 0, 128,
0, 0, 0, 0, 0, 64, 10, 0, 0, 0, 0, 32, 64, 0, 0, 0, 0, 0, 16, 0, 16, 16, 0, 0,
80, 2, 0, 0, 0, 0, 8, 0, 0, 16, 0, 8, 0, 0, 0, 8, 64, 128, 0, 0, 0, 8, 208, 0,
0, 0, 0, 0, 0, 0, 32, 0, 0, 0, 0, 0, 0, 32, 0, 8, 0, 128, 0, 0, 0, 1, 0, 0, 0,
16, 8, 1, 136, 0, 0, 36, 0, 64, 9, 0, 1, 32, 8, 0, 64, 64, 131, 16, 224, 32, 4,
0, 4, 5, 160, 0, 131, 0, 4, 96, 0, 0, 184, 192, 0, 177, 205, 96, 0, 0, 0, 0, 2,
0, 32, 0, 0, 0, 0, 0, 0, 0, 0, 64, 0, 0, 128, 0, 0, 8, 0, 0, 0, 0, 1, 4, 0, 1,
0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 4, 0, 0, 64, 69, 0, 0, 8, 2, 66, 32, 64, 0, 0, 0,
0, 0, 1, 0, 128, 17, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 12, 0, 16, 0, 0, 4, 128, 64,
0, 0, 0, 0, 0, 0, 0, 0, 224, 0, 8, 0, 0, 130, 16, 64, 128, 2, 64, 0, 0, 0, 128,
2, 192, 64, 0, 65, 0, 0, 0, 16, 0, 0, 0, 32, 4, 2, 2, 76, 0, 0, 0, 4, 72, 52,
131, 44, 76, 0, 0, 0, 0, 64, 1, 16, 148, 4, 0, 16, 10, 64, 0, 2, 0, 1, 0, 128,
64, 68, 0, 0, 0, 0, 0, 64, 144, 0, 8, 0, 2, 0, 0, 0, 0, 0, 0, 3, 64, 0, 0, 0, 0,
1, 128, 0, 0, 32, 66, 0, 0, 0, 40, 0, 18, 0, 0, 0, 0, 0, 33, 0, 0, 32, 0, 0, 32,
0, 128, 4, 64, 145, 140, 0, 0, 0, 128, 0, 2, 0, 0, 20, 0, 80, 38, 0, 0, 32, 0,
32, 64, 4, 4, 0, 4, 0, 0, 0, 129, 4, 0, 0, 144, 17, 32, 130, 16, 132, 24, 134,
0, 0, 64, 2, 5, 50, 8, 194, 33, 1, 68, 117, 1, 8, 32, 161, 54, 0, 130, 34, 0, 0,
0, 64, 128, 0, 0, 2, 0, 0, 0, 0, 32, 1, 0, 0, 0, 3, 14, 0, 0, 0, 0, 0, 16, 4, 0,
0, 0, 0, 0, 0, 0, 0, 96, 1, 24, 18, 0, 1, 128, 24, 0, 64, 0, 4, 0, 16, 128, 0,
64, 0, 0, 0, 64, 0, 8, 0, 0, 0, 0, 0, 66, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 16, 0, 64, 2, 0, 0, 0, 0, 6, 0, 8, 8, 2, 0, 64, 0, 0, 0, 0, 128, 2,
2, 12, 64, 0, 64, 0, 8, 0, 128, 32, 0, 0, 10, 0, 0, 32, 0, 128, 32, 33, 8, 136,
0, 96, 64, 0, 0, 0, 0, 0, 64, 4, 16, 4, 8, 0, 0, 0, 16, 0, 2, 0, 0, 1, 128, 0,
64, 16, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 2, 0, 16, 0, 4, 0, 8, 0, 0, 0, 0, 0,
20, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 136, 0, 0, 0, 0, 0, 8, 0,
0, 0, 0, 0, 2, 0, 0, 0, 64, 0, 0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 0, 0,
0, 0, 4, 0, 0, 0, 0, 65, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 2, 128, 0, 0, 0, 8, 2, 0, 0, 128, 0, 16, 2, 0, 0, 4, 0, 32, 0, 0, 1,
4, 64, 64, 0, 4, 0, 1, 0, 16, 0, 32, 68, 4, 4, 65, 10, 0, 20, 37, 18, 1, 148, 0,
32, 128, 3, 8, 0, 64, 0, 0, 0, 0, 0, 0, 4, 0, 16, 1, 128, 0, 0, 0, 128, 16, 0,
0, 0, 0, 1, 128, 0, 0, 128, 64, 128, 64, 0, 130, 0, 164, 8, 0, 0, 1, 64, 128, 0,
18, 0, 2, 150, 0, 8, 0, 0, 64, 0, 81, 0, 0, 16, 128, 2, 8, 36, 32, 129, 4, 144,
13, 0, 0, 3, 8, 1, 0, 2, 0, 0, 64, 0, 5, 0, 1, 34, 1, 32, 2, 16, 128, 128, 128,
0, 0, 0, 2, 0, 4, 18, 8, 12, 34, 32, 192, 6, 64, 224, 33, 0, 0, 137, 72, 64, 0,
24, 8, 128, 128, 0, 16, 0, 32, 128, 128, 132, 8, 0, 0, 16, 0, 64, 0, 0, 4, 0, 0,
16, 0, 4, 128, 64, 0, 0, 1, 0, 4, 64, 32, 144, 130, 2, 128, 0, 192, 0, 64, 82,
64, 1, 32, 128, 128, 2, 0, 84, 0, 32, 0, 44, 24, 72, 80, 32, 16, 0, 0, 44, 16,
96, 64, 1, 72, 131, 0, 0, 0, 16, 0, 0, 165, 0, 129, 2, 49, 48, 64, 64, 12, 64,
176, 64, 84, 8, 128, 20, 64, 213, 136, 104, 1, 41, 15, 83, 170, 0, 0, 41, 1, 64,
64, 0, 193, 64, 64, 8, 0, 128, 0, 0, 64, 8, 64, 8, 1, 16, 0, 8, 0, 0, 2, 1, 128,
28, 84, 141, 97, 0, 0, 68, 0, 0, 129, 8, 0, 16, 8, 32, 0, 64, 0, 0, 0, 24, 0, 0,
0, 192, 0, 8, 128, 0, 0, 0, 0, 0, 64, 0, 1, 0, 0, 0, 0, 40, 1, 128, 64, 0, 4, 2,
32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 32, 8, 0, 32, 0, 0, 0, 16, 17, 0,
2, 4, 0, 0, 33, 128, 2, 0, 0, 0, 0, 129, 0, 2, 0, 0, 0, 36, 0, 32, 2, 0, 0, 0,
0, 0, 0, 32, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 4, 32, 64, 0, 0, 0, 0, 0, 0,
32, 0, 0, 32, 128, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 128, 16, 0, 0, 0, 0, 0, 0, 0,
1, 0, 136, 0, 0, 24, 192, 128, 3, 0, 17, 18, 2, 0, 66, 0, 4, 24, 0, 9, 208, 167,
0, 144, 20, 64, 0, 130, 64, 0, 2, 16, 136, 8, 74, 32, 0, 168, 0, 65, 32, 8, 12,
1, 3, 1, 64, 180, 3, 0, 64, 0, 8, 0, 0, 32, 65, 0, 4, 16, 4, 16, 68, 32, 64, 36,
32, 24, 33, 1, 128, 0, 0, 8, 0, 32, 64, 81, 0, 1, 10, 19, 8, 0, 0, 4, 5, 144, 0,
0, 8, 128, 0, 0, 4, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 8, 0, 0, 0, 0, 0, 80, 1, 0, 0,
33, 0, 32, 66, 4, 2, 0, 1, 43, 2, 0, 0, 4, 32, 16, 0, 64, 0, 3, 32, 0, 2, 64,
64, 116, 0, 65, 52, 64, 0, 17, 64, 192, 96, 8, 10, 8, 2, 4, 0, 17, 64, 0, 4, 0,
0, 4, 128, 0, 0, 9, 0, 0, 130, 2, 0, 192, 0, 48, 128, 64, 0, 96, 0, 64, 0, 1,
16, 32, 0, 1, 32, 6, 128, 2, 32, 0, 12, 0, 0, 48, 32, 8, 0, 0, 128, 0, 18, 0,
0, 28, 24, 41, 16, 5, 32, 0, 0, 0, 0, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 16, 0, 0, 0, 0, 64, 0, 0, 0, 0, 8, 0, 0, 0, 0, 16, 128,
0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 33, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 16, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0};
memset(lengths, 0, BROTLI_ENC_NUM_HASH_BUCKETS);
for (len = BROTLI_MAX_DICTIONARY_WORD_LENGTH;
len >= BROTLI_MIN_DICTIONARY_WORD_LENGTH; --len) {
size_t length_lt_8 = len < 8 ? 1 : 0;
size_t n = 1u << dict->size_bits_by_length[len];
const uint8_t* dict_words = dict->data + dict->offsets_by_length[len];
for (i = 0; i < n; ++i) {
size_t j = n - 1 - i;
const uint8_t* word = dict_words + len * j;
const uint32_t key = Hash14(word);
size_t idx = (key << 1) + length_lt_8;
if ((lengths[idx] & 0x80) == 0) {
BROTLI_BOOL is_final = TO_BROTLI_BOOL(frozen_idx[global_idx / 8] &
(1u << (global_idx % 8)));
words[idx] = (uint16_t)j;
lengths[idx] = (uint8_t)(len + (is_final ? 0x80 : 0));
}
global_idx++;
}
}
for (i = 0; i < BROTLI_ENC_NUM_HASH_BUCKETS; ++i) {
lengths[i] &= 0x7F;
}
return BROTLI_TRUE;
}
BROTLI_MODEL("small")
uint16_t kStaticDictionaryHashWords[BROTLI_ENC_NUM_HASH_BUCKETS];
BROTLI_MODEL("small")
uint8_t kStaticDictionaryHashLengths[BROTLI_ENC_NUM_HASH_BUCKETS];
#else /* BROTLI_STATIC_INIT */
/* Embed kStaticDictionaryHashWords and kStaticDictionaryHashLengths. */
#include "dictionary_hash_inc.h"
#endif /* BROTLI_STATIC_INIT */
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

45
c/enc/dictionary_hash.h Normal file
View File

@@ -0,0 +1,45 @@
/* Copyright 2015 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Hash table on the 4-byte prefixes of static dictionary words. */
#ifndef BROTLI_ENC_DICTIONARY_HASH_H_
#define BROTLI_ENC_DICTIONARY_HASH_H_
#include "../common/platform.h"
#include "../common/static_init.h"
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
#include "../common/dictionary.h"
#endif
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* Bucket is (Hash14 * 2 + length_lt_8); in other words we reserve 2 buckets
for each hash - one for shorter words and one for longer words. */
#define BROTLI_ENC_NUM_HASH_BUCKETS 32768
#if (BROTLI_STATIC_INIT != BROTLI_STATIC_INIT_NONE)
BROTLI_BOOL BROTLI_INTERNAL BrotliEncoderInitDictionaryHash(
const BrotliDictionary* dictionary, uint16_t* words, uint8_t* lengths);
BROTLI_INTERNAL extern BROTLI_MODEL("small") uint16_t
kStaticDictionaryHashWords[BROTLI_ENC_NUM_HASH_BUCKETS];
BROTLI_INTERNAL extern BROTLI_MODEL("small") uint8_t
kStaticDictionaryHashLengths[BROTLI_ENC_NUM_HASH_BUCKETS];
#else
BROTLI_INTERNAL extern const BROTLI_MODEL("small") uint16_t
kStaticDictionaryHashWords[BROTLI_ENC_NUM_HASH_BUCKETS];
BROTLI_INTERNAL extern const BROTLI_MODEL("small") uint8_t
kStaticDictionaryHashLengths[BROTLI_ENC_NUM_HASH_BUCKETS];
#endif
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_DICTIONARY_HASH_H_ */

1829
c/enc/dictionary_hash_inc.h Normal file

File diff suppressed because it is too large Load Diff

2021
c/enc/encode.c Normal file

File diff suppressed because it is too large Load Diff

642
c/enc/encoder_dict.c Normal file
View File

@@ -0,0 +1,642 @@
/* Copyright 2017 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#include "encoder_dict.h"
#include "../common/dictionary.h"
#include "../common/platform.h"
#include <brotli/shared_dictionary.h>
#include "../common/shared_dictionary_internal.h"
#include "../common/transform.h"
#include <brotli/encode.h>
#include "compound_dictionary.h"
#include "dictionary_hash.h"
#include "hash_base.h"
#include "hash.h"
#include "memory.h"
#include "quality.h"
#include "static_dict_lut.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
#define NUM_HASH_BITS 15u
#define NUM_HASH_BUCKETS (1u << NUM_HASH_BITS)
static void BrotliTrieInit(BrotliTrie* trie) {
trie->pool_capacity = 0;
trie->pool_size = 0;
trie->pool = 0;
/* Set up the root node */
trie->root.single = 0;
trie->root.len_ = 0;
trie->root.idx_ = 0;
trie->root.sub = 0;
}
static void BrotliTrieFree(MemoryManager* m, BrotliTrie* trie) {
BrotliFree(m, trie->pool);
}
/* Initializes to RFC 7932 static dictionary / transforms. */
static void InitEncoderDictionary(BrotliEncoderDictionary* dict) {
dict->words = BrotliGetDictionary();
dict->num_transforms = (uint32_t)BrotliGetTransforms()->num_transforms;
dict->hash_table_words = kStaticDictionaryHashWords;
dict->hash_table_lengths = kStaticDictionaryHashLengths;
dict->buckets = kStaticDictionaryBuckets;
dict->dict_words = kStaticDictionaryWords;
dict->cutoffTransformsCount = kCutoffTransformsCount;
dict->cutoffTransforms = kCutoffTransforms;
dict->parent = 0;
dict->hash_table_data_words_ = 0;
dict->hash_table_data_lengths_ = 0;
dict->buckets_alloc_size_ = 0;
dict->buckets_data_ = 0;
dict->dict_words_alloc_size_ = 0;
dict->dict_words_data_ = 0;
dict->words_instance_ = 0;
dict->has_words_heavy = BROTLI_FALSE;
BrotliTrieInit(&dict->trie);
}
static void BrotliDestroyEncoderDictionary(MemoryManager* m,
BrotliEncoderDictionary* dict) {
BrotliFree(m, dict->hash_table_data_words_);
BrotliFree(m, dict->hash_table_data_lengths_);
BrotliFree(m, dict->buckets_data_);
BrotliFree(m, dict->dict_words_data_);
BrotliFree(m, dict->words_instance_);
BrotliTrieFree(m, &dict->trie);
}
#if defined(BROTLI_EXPERIMENTAL)
/* Word length must be at least 4 bytes */
static uint32_t Hash(const uint8_t* data, int bits) {
uint32_t h = BROTLI_UNALIGNED_LOAD32LE(data) * kHashMul32;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return h >> (32 - bits);
}
/* Theoretical max possible word size after transform */
#define kTransformedBufferSize \
(256 + 256 + SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH)
/* To be safe buffer must have at least kTransformedBufferSize */
static void TransformedDictionaryWord(uint32_t word_idx, int len, int transform,
const BrotliTransforms* transforms,
const BrotliEncoderDictionary* dict,
uint8_t* buffer, size_t* size) {
const uint8_t* dict_word = &dict->words->data[
dict->words->offsets_by_length[len] + (uint32_t)len * word_idx];
*size = (size_t)BrotliTransformDictionaryWord(buffer, dict_word, len,
transforms, transform);
}
static DictWord MakeDictWord(uint8_t len, uint8_t transform, uint16_t idx) {
DictWord result;
result.len = len;
result.transform = transform;
result.idx = idx;
return result;
}
static uint32_t BrotliTrieAlloc(MemoryManager* m, size_t num, BrotliTrie* trie,
BrotliTrieNode** keep) {
uint32_t result;
uint32_t keep_index = 0;
if (keep && *keep != &trie->root) {
/* Optional node to keep, since address may change after re-allocating */
keep_index = (uint32_t)(*keep - trie->pool);
}
if (trie->pool_size == 0) {
/* Have a placeholder node in the front. We do not want the result to be 0,
it must be at least 1, 0 represents "null pointer" */
trie->pool_size = 1;
}
BROTLI_ENSURE_CAPACITY(m, BrotliTrieNode, trie->pool, trie->pool_capacity,
trie->pool_size + num);
if (BROTLI_IS_OOM(m)) return 0;
/* Init the new nodes to empty */
memset(trie->pool + trie->pool_size, 0, sizeof(*trie->pool) * num);
result = (uint32_t)trie->pool_size;
trie->pool_size += num;
if (keep && *keep != &trie->root) {
*keep = trie->pool + keep_index;
}
return result;
}
/**
* len and idx: payload for last node
* word, size: the string
* index: position in the string
*/
static BROTLI_BOOL BrotliTrieNodeAdd(MemoryManager* m, uint8_t len,
uint32_t idx, const uint8_t* word, size_t size, int index,
BrotliTrieNode* node, BrotliTrie* trie) {
BrotliTrieNode* child = 0;
uint8_t c;
if ((size_t)index == size) {
if (!node->len_ || idx < node->idx_) {
node->len_ = len;
node->idx_ = idx;
}
return BROTLI_TRUE;
}
c = word[index];
if (node->single && c != node->c) {
BrotliTrieNode old = trie->pool[node->sub];
uint32_t new_nodes = BrotliTrieAlloc(m, 32, trie, &node);
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
node->single = 0;
node->sub = new_nodes;
trie->pool[node->sub + (node->c >> 4)].sub = new_nodes + 16;
trie->pool[trie->pool[node->sub + (node->c >> 4)].sub + (node->c & 15)] =
old;
}
if (!node->sub) {
uint32_t new_node = BrotliTrieAlloc(m, 1, trie, &node);
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
node->single = 1;
node->c = c;
node->sub = new_node;
}
if (node->single) {
child = &trie->pool[node->sub];
} else {
if (!trie->pool[node->sub + (c >> 4)].sub) {
uint32_t new_nodes = BrotliTrieAlloc(m, 16, trie, &node);
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
trie->pool[node->sub + (c >> 4)].sub = new_nodes;
}
child = &trie->pool[trie->pool[node->sub + (c >> 4)].sub + (c & 15)];
}
return BrotliTrieNodeAdd(m, len, idx, word, size, index + 1, child, trie);
}
static BROTLI_BOOL BrotliTrieAdd(MemoryManager* m, uint8_t len, uint32_t idx,
const uint8_t* word, size_t size, BrotliTrie* trie) {
return BrotliTrieNodeAdd(m, len, idx, word, size, 0, &trie->root, trie);
}
const BrotliTrieNode* BrotliTrieSub(const BrotliTrie* trie,
const BrotliTrieNode* node, uint8_t c) {
BrotliTrieNode* temp_node;
if (node->single) {
if (node->c == c) return &trie->pool[node->sub];
return 0;
}
if (!node->sub) return 0;
temp_node = &trie->pool[node->sub + (c >> 4)];
if (!temp_node->sub) return 0;
return &trie->pool[temp_node->sub + (c & 15)];
}
static const BrotliTrieNode* BrotliTrieFind(const BrotliTrie* trie,
const uint8_t* word, size_t size) {
const BrotliTrieNode* node = &trie->root;
size_t i;
for (i = 0; i < size; i++) {
node = BrotliTrieSub(trie, node, word[i]);
if (!node) return 0;
}
return node;
}
static BROTLI_BOOL BuildDictionaryLut(MemoryManager* m,
const BrotliTransforms* transforms,
BrotliEncoderDictionary* dict) {
uint32_t i;
DictWord* dict_words;
uint16_t* buckets;
DictWord** words_by_hash;
size_t* words_by_hash_size;
size_t* words_by_hash_capacity;
BrotliTrie dedup;
uint8_t word[kTransformedBufferSize];
size_t word_size;
size_t total = 0;
uint8_t l;
uint16_t idx;
BrotliTrieInit(&dedup);
words_by_hash = (DictWord**)BrotliAllocate(m,
sizeof(*words_by_hash) * NUM_HASH_BUCKETS);
words_by_hash_size = (size_t*)BrotliAllocate(m,
sizeof(*words_by_hash_size) * NUM_HASH_BUCKETS);
words_by_hash_capacity = (size_t*)BrotliAllocate(m,
sizeof(*words_by_hash_capacity) * NUM_HASH_BUCKETS);
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
memset(words_by_hash, 0, sizeof(*words_by_hash) * NUM_HASH_BUCKETS);
memset(words_by_hash_size, 0, sizeof(*words_by_hash_size) * NUM_HASH_BUCKETS);
memset(words_by_hash_capacity, 0,
sizeof(*words_by_hash_capacity) * NUM_HASH_BUCKETS);
if (transforms->num_transforms > 0) {
for (l = SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH;
l <= SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH; ++l) {
uint16_t n = dict->words->size_bits_by_length[l] ?
(uint16_t)(1 << dict->words->size_bits_by_length[l]) : 0u;
for (idx = 0; idx < n; ++idx) {
uint32_t key;
/* First transform (usually identity) */
TransformedDictionaryWord(idx, l, 0, transforms, dict, word,
&word_size);
/* Cannot hash words smaller than 4 bytes */
if (word_size < 4) {
/* Break instead of continue, all next words of this length will have
same length after transform */
break;
}
if (!BrotliTrieAdd(m, 0, idx, word, word_size, &dedup)) {
return BROTLI_FALSE;
}
key = Hash(word, NUM_HASH_BITS);
BROTLI_ENSURE_CAPACITY_APPEND(m, DictWord, words_by_hash[key],
words_by_hash_capacity[key], words_by_hash_size[key],
MakeDictWord(l, 0, idx));
++total;
}
}
}
/* These LUT transforms only supported if no custom transforms. This is
ok, we will use the heavy trie instead. */
if (transforms == BrotliGetTransforms()) {
for (l = SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH;
l <= SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH; ++l) {
uint16_t n = dict->words->size_bits_by_length[l] ?
(uint16_t)(1 << dict->words->size_bits_by_length[l]) : 0u;
for (idx = 0; idx < n; ++idx) {
int k;
BROTLI_BOOL is_ascii = BROTLI_TRUE;
size_t offset = dict->words->offsets_by_length[l] + (size_t)l * idx;
const uint8_t* data = &dict->words->data[offset];
for (k = 0; k < l; ++k) {
if (data[k] >= 128) is_ascii = BROTLI_FALSE;
}
if (data[0] < 128) {
int transform = 9; /* {empty, uppercase first, empty} */
uint32_t ix = idx + (uint32_t)transform * n;
const BrotliTrieNode* it;
TransformedDictionaryWord(idx, l, transform, transforms,
dict, word, &word_size);
it = BrotliTrieFind(&dedup, word, word_size);
if (!it || it->idx_ > ix) {
uint32_t key = Hash(word, NUM_HASH_BITS);
if (!BrotliTrieAdd(m, 0, ix, word, word_size, &dedup)) {
return BROTLI_FALSE;
}
BROTLI_ENSURE_CAPACITY_APPEND(m, DictWord, words_by_hash[key],
words_by_hash_capacity[key], words_by_hash_size[key],
MakeDictWord(l, BROTLI_TRANSFORM_UPPERCASE_FIRST, idx));
++total;
}
}
if (is_ascii) {
int transform = 44; /* {empty, uppercase all, empty} */
uint32_t ix = idx + (uint32_t)transform * n;
const BrotliTrieNode* it;
TransformedDictionaryWord(idx, l, transform, transforms,
dict, word, &word_size);
it = BrotliTrieFind(&dedup, word, word_size);
if (!it || it->idx_ > ix) {
uint32_t key = Hash(word, NUM_HASH_BITS);
if (!BrotliTrieAdd(m, 0, ix, word, word_size, &dedup)) {
return BROTLI_FALSE;
}
BROTLI_ENSURE_CAPACITY_APPEND(m, DictWord, words_by_hash[key],
words_by_hash_capacity[key], words_by_hash_size[key],
MakeDictWord(l, BROTLI_TRANSFORM_UPPERCASE_ALL, idx));
++total;
}
}
}
}
}
dict_words = (DictWord*)BrotliAllocate(m,
sizeof(*dict->dict_words) * (total + 1));
buckets = (uint16_t*)BrotliAllocate(m,
sizeof(*dict->buckets) * NUM_HASH_BUCKETS);
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
dict->dict_words_alloc_size_ = total + 1;
dict->dict_words = dict->dict_words_data_ = dict_words;
dict->buckets_alloc_size_ = NUM_HASH_BUCKETS;
dict->buckets = dict->buckets_data_ = buckets;
/* Unused; makes offsets start from 1. */
dict_words[0] = MakeDictWord(0, 0, 0);
total = 1;
for (i = 0; i < NUM_HASH_BUCKETS; ++i) {
size_t num_words = words_by_hash_size[i];
if (num_words > 0) {
buckets[i] = (uint16_t)(total);
memcpy(&dict_words[total], &words_by_hash[i][0],
sizeof(dict_words[0]) * num_words);
total += num_words;
dict_words[total - 1].len |= 0x80;
} else {
buckets[i] = 0;
}
}
for (i = 0; i < NUM_HASH_BUCKETS; ++i) {
BrotliFree(m, words_by_hash[i]);
}
BrotliFree(m, words_by_hash);
BrotliFree(m, words_by_hash_size);
BrotliFree(m, words_by_hash_capacity);
BrotliTrieFree(m, &dedup);
return BROTLI_TRUE;
}
static void BuildDictionaryHashTable(uint16_t* hash_table_words,
uint8_t* hash_table_lengths, const BrotliDictionary* dict) {
int j, len;
/* The order of the loops is such that in case of collision, words with
shorter length are preferred, and in case of same length, words with
smaller index. There is only a single word per bucket. */
/* TODO(lode): consider adding optional user-supplied frequency_map to use
for preferred words instead, this can make the encoder better for
quality 9 and below without affecting the decoder */
memset(hash_table_words, 0, sizeof(kStaticDictionaryHashWords));
memset(hash_table_lengths, 0, sizeof(kStaticDictionaryHashLengths));
for (len = SHARED_BROTLI_MAX_DICTIONARY_WORD_LENGTH;
len >= SHARED_BROTLI_MIN_DICTIONARY_WORD_LENGTH; --len) {
const size_t num_words = dict->size_bits_by_length[len] ?
(1u << dict->size_bits_by_length[len]) : 0;
for (j = (int)num_words - 1; j >= 0; --j) {
size_t offset = dict->offsets_by_length[len] +
(size_t)len * (size_t)j;
const uint8_t* word = &dict->data[offset];
const uint32_t key = Hash(word, 14);
int idx = (int)(key << 1) + (len < 8 ? 1 : 0);
BROTLI_DCHECK(idx < (int)NUM_HASH_BUCKETS);
hash_table_words[idx] = (uint16_t)j;
hash_table_lengths[idx] = (uint8_t)len;
}
}
}
static BROTLI_BOOL GenerateWordsHeavy(MemoryManager* m,
const BrotliTransforms* transforms,
BrotliEncoderDictionary* dict) {
int i, j, l;
for (j = (int)transforms->num_transforms - 1; j >= 0 ; --j) {
for (l = 0; l < 32; l++) {
int num = (int)((1u << dict->words->size_bits_by_length[l]) & ~1u);
for (i = 0; i < num; i++) {
uint8_t transformed[kTransformedBufferSize];
size_t size;
TransformedDictionaryWord(
(uint32_t)i, l, j, transforms, dict, transformed, &size);
if (size < 4) continue;
if (!BrotliTrieAdd(m, (uint8_t)l, (uint32_t)(i + num * j),
transformed, size, &dict->trie)) {
return BROTLI_FALSE;
}
}
}
}
return BROTLI_TRUE;
}
/* Computes cutoffTransformsCount (in count) and cutoffTransforms (in data) for
the custom transforms, where possible within the limits of the
cutoffTransforms encoding. The fast encoder uses this to do fast lookup for
transforms that remove the N last characters (OmitLast). */
static void ComputeCutoffTransforms(
const BrotliTransforms* transforms,
uint32_t* count, uint64_t* data) {
int i;
/* The encoding in a 64-bit integer of transform N in the data is: (N << 2) +
((cutoffTransforms >> (N * 6)) & 0x3F), so for example the identity
transform code must be 0-63, for N=1 the transform code must be 4-67, ...,
for N=9 it must be 36-99.
TODO(lode): consider a simple flexible uint8_t[10] instead of the uint64_t
for the cutoff transforms, so that shared dictionaries can have the
OmitLast transforms anywhere without loss. */
*count = 0;
*data = 0;
for (i = 0; i < BROTLI_TRANSFORMS_MAX_CUT_OFF + 1; i++) {
int idx = transforms->cutOffTransforms[i];
if (idx == -1) break; /* Not found */
if (idx < (i << 2)) break; /* Too small for the encoding */
if (idx >= (i << 2) + 64) break; /* Too large for the encoding */
(*count)++;
*data |= (uint64_t)(((uint64_t)idx -
((uint64_t)i << 2u)) << ((uint64_t)i * 6u));
}
}
static BROTLI_BOOL ComputeDictionary(MemoryManager* m, int quality,
const BrotliTransforms* transforms,
BrotliEncoderDictionary* current) {
int default_words = current->words == BrotliGetDictionary();
int default_transforms = transforms == BrotliGetTransforms();
if (default_words && default_transforms) {
/* hashes are already set to Brotli defaults */
return BROTLI_TRUE;
}
current->hash_table_data_words_ = (uint16_t*)BrotliAllocate(
m, sizeof(kStaticDictionaryHashWords));
current->hash_table_data_lengths_ = (uint8_t*)BrotliAllocate(
m, sizeof(kStaticDictionaryHashLengths));
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
current->hash_table_words = current->hash_table_data_words_;
current->hash_table_lengths = current->hash_table_data_lengths_;
BuildDictionaryHashTable(current->hash_table_data_words_,
current->hash_table_data_lengths_, current->words);
ComputeCutoffTransforms(transforms,
&current->cutoffTransformsCount, &current->cutoffTransforms);
/* Only compute the data for slow encoder if the requested quality is high
enough to need it */
if (quality >= ZOPFLIFICATION_QUALITY) {
if (!BuildDictionaryLut(m, transforms, current)) return BROTLI_FALSE;
/* For the built-in Brotli transforms, there is a hard-coded function to
handle all transforms, but for custom transforms, we use the following
large hammer instead */
current->has_words_heavy = !default_transforms;
if (current->has_words_heavy) {
if (!GenerateWordsHeavy(m, transforms, current)) return BROTLI_FALSE;
}
}
return BROTLI_TRUE;
}
#endif /* BROTLI_EXPERIMENTAL */
void BrotliInitSharedEncoderDictionary(SharedEncoderDictionary* dict) {
dict->magic = kSharedDictionaryMagic;
dict->compound.num_chunks = 0;
dict->compound.total_size = 0;
dict->compound.chunk_offsets[0] = 0;
dict->compound.num_prepared_instances_ = 0;
dict->contextual.context_based = 0;
dict->contextual.num_dictionaries = 1;
dict->contextual.instances_ = 0;
dict->contextual.num_instances_ = 1; /* The instance_ field */
dict->contextual.dict[0] = &dict->contextual.instance_;
InitEncoderDictionary(&dict->contextual.instance_);
dict->contextual.instance_.parent = &dict->contextual;
dict->max_quality = BROTLI_MAX_QUALITY;
}
#if defined(BROTLI_EXPERIMENTAL)
/* TODO(eustas): make sure that tooling will warn user if not all the cutoff
transforms are available (for low-quality encoder). */
static BROTLI_BOOL InitCustomSharedEncoderDictionary(
MemoryManager* m, const BrotliSharedDictionary* decoded_dict,
int quality, SharedEncoderDictionary* dict) {
ContextualEncoderDictionary* contextual;
CompoundDictionary* compound;
BrotliEncoderDictionary* instances;
int i;
BrotliInitSharedEncoderDictionary(dict);
contextual = &dict->contextual;
compound = &dict->compound;
for (i = 0; i < (int)decoded_dict->num_prefix; i++) {
PreparedDictionary* prepared = CreatePreparedDictionary(m,
decoded_dict->prefix[i], decoded_dict->prefix_size[i]);
AttachPreparedDictionary(compound, prepared);
/* remember for cleanup */
compound->prepared_instances_[
compound->num_prepared_instances_++] = prepared;
}
dict->max_quality = quality;
contextual->context_based = decoded_dict->context_based;
if (decoded_dict->context_based) {
memcpy(contextual->context_map, decoded_dict->context_map,
SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS);
}
contextual->num_dictionaries = decoded_dict->num_dictionaries;
contextual->num_instances_ = decoded_dict->num_dictionaries;
if (contextual->num_instances_ == 1) {
instances = &contextual->instance_;
} else {
contextual->instances_ = (BrotliEncoderDictionary*)
BrotliAllocate(m, sizeof(*contextual->instances_) *
contextual->num_instances_);
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
instances = contextual->instances_;
}
for (i = 0; i < (int)contextual->num_instances_; i++) {
BrotliEncoderDictionary* current = &instances[i];
InitEncoderDictionary(current);
current->parent = &dict->contextual;
if (decoded_dict->words[i] == BrotliGetDictionary()) {
current->words = BrotliGetDictionary();
} else {
current->words_instance_ = (BrotliDictionary*)BrotliAllocate(
m, sizeof(BrotliDictionary));
if (BROTLI_IS_OOM(m)) return BROTLI_FALSE;
*current->words_instance_ = *decoded_dict->words[i];
current->words = current->words_instance_;
}
current->num_transforms =
(uint32_t)decoded_dict->transforms[i]->num_transforms;
if (!ComputeDictionary(
m, quality, decoded_dict->transforms[i], current)) {
return BROTLI_FALSE;
}
contextual->dict[i] = current;
}
return BROTLI_TRUE; /* success */
}
BROTLI_BOOL BrotliInitCustomSharedEncoderDictionary(
MemoryManager* m, const uint8_t* encoded_dict, size_t size,
int quality, SharedEncoderDictionary* dict) {
BROTLI_BOOL success = BROTLI_FALSE;
BrotliSharedDictionary* decoded_dict = BrotliSharedDictionaryCreateInstance(
m->alloc_func, m->free_func, m->opaque);
if (!decoded_dict) { /* OOM */
return BROTLI_FALSE;
}
success = BrotliSharedDictionaryAttach(
decoded_dict, BROTLI_SHARED_DICTIONARY_SERIALIZED, size, encoded_dict);
if (success) {
success = InitCustomSharedEncoderDictionary(m,
decoded_dict, quality, dict);
}
BrotliSharedDictionaryDestroyInstance(decoded_dict);
return success;
}
#endif /* BROTLI_EXPERIMENTAL */
void BrotliCleanupSharedEncoderDictionary(MemoryManager* m,
SharedEncoderDictionary* dict) {
size_t i;
for (i = 0; i < dict->compound.num_prepared_instances_; i++) {
DestroyPreparedDictionary(m,
(PreparedDictionary*)dict->compound.prepared_instances_[i]);
}
if (dict->contextual.num_instances_ == 1) {
BrotliDestroyEncoderDictionary(m, &dict->contextual.instance_);
} else if (dict->contextual.num_instances_ > 1) {
for (i = 0; i < dict->contextual.num_instances_; i++) {
BrotliDestroyEncoderDictionary(m, &dict->contextual.instances_[i]);
}
BrotliFree(m, dict->contextual.instances_);
}
}
ManagedDictionary* BrotliCreateManagedDictionary(
brotli_alloc_func alloc_func, brotli_free_func free_func, void* opaque) {
ManagedDictionary* result = (ManagedDictionary*)BrotliBootstrapAlloc(
sizeof(ManagedDictionary), alloc_func, free_func, opaque);
if (result == NULL) return NULL;
result->magic = kManagedDictionaryMagic;
BrotliInitMemoryManager(
&result->memory_manager_, alloc_func, free_func, opaque);
result->dictionary = NULL;
return result;
}
void BrotliDestroyManagedDictionary(ManagedDictionary* dictionary) {
if (!dictionary) return;
BrotliBootstrapFree(dictionary, &dictionary->memory_manager_);
}
/* Escalate internal functions visibility; for testing purposes only. */
#if defined(BROTLI_TEST)
void BrotliInitEncoderDictionaryForTest(BrotliEncoderDictionary*);
void BrotliInitEncoderDictionaryForTest(BrotliEncoderDictionary* d) {
InitEncoderDictionary(d);
}
#endif
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

155
c/enc/encoder_dict.h Normal file
View File

@@ -0,0 +1,155 @@
/* Copyright 2017 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
#ifndef BROTLI_ENC_ENCODER_DICT_H_
#define BROTLI_ENC_ENCODER_DICT_H_
#include "../common/dictionary.h"
#include <brotli/shared_dictionary.h>
#include "../common/platform.h"
#include "compound_dictionary.h"
#include "memory.h"
#include "static_dict_lut.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/*
Dictionary hierarchy for Encoder:
-SharedEncoderDictionary
--CompoundDictionary
---PreparedDictionary [up to 15x]
= prefix dictionary with precomputed hashes
--ContextualEncoderDictionary
---BrotliEncoderDictionary [up to 64x]
= for each context, precomputed static dictionary with words + transforms
Dictionary hierarchy from common: similar, but without precomputed hashes
-BrotliSharedDictionary
--BrotliDictionary [up to 64x]
--BrotliTransforms [up to 64x]
--const uint8_t* prefix [up to 15x]: compound dictionaries
*/
typedef struct BrotliTrieNode {
uint8_t single; /* if 1, sub is a single node for c instead of 256 */
uint8_t c;
uint8_t len_; /* untransformed length */
uint32_t idx_; /* word index + num words * transform index */
uint32_t sub; /* index of sub node(s) in the pool */
} BrotliTrieNode;
typedef struct BrotliTrie {
BrotliTrieNode* pool;
size_t pool_capacity;
size_t pool_size;
BrotliTrieNode root;
} BrotliTrie;
#if defined(BROTLI_EXPERIMENTAL)
BROTLI_INTERNAL const BrotliTrieNode* BrotliTrieSub(const BrotliTrie* trie,
const BrotliTrieNode* node, uint8_t c);
#endif /* BROTLI_EXPERIMENTAL */
/* Dictionary data (words and transforms) for 1 possible context */
typedef struct BrotliEncoderDictionary {
const BrotliDictionary* words;
uint32_t num_transforms;
/* cut off for fast encoder */
uint32_t cutoffTransformsCount;
uint64_t cutoffTransforms;
/* from dictionary_hash.h, for fast encoder */
const uint16_t* hash_table_words;
const uint8_t* hash_table_lengths;
/* from static_dict_lut.h, for slow encoder */
const uint16_t* buckets;
const DictWord* dict_words;
/* Heavy version, for use by slow encoder when there are custom transforms.
Contains every possible transformed dictionary word in a trie. It encodes
about as fast as the non-heavy encoder but consumes a lot of memory and
takes time to build. */
BrotliTrie trie;
BROTLI_BOOL has_words_heavy;
/* Reference to other dictionaries. */
const struct ContextualEncoderDictionary* parent;
/* Allocated memory, used only when not using the Brotli defaults */
uint16_t* hash_table_data_words_;
uint8_t* hash_table_data_lengths_;
size_t buckets_alloc_size_;
uint16_t* buckets_data_;
size_t dict_words_alloc_size_;
DictWord* dict_words_data_;
BrotliDictionary* words_instance_;
} BrotliEncoderDictionary;
/* Dictionary data for all 64 contexts */
typedef struct ContextualEncoderDictionary {
BROTLI_BOOL context_based;
uint8_t num_dictionaries;
uint8_t context_map[SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS];
const BrotliEncoderDictionary* dict[SHARED_BROTLI_NUM_DICTIONARY_CONTEXTS];
/* If num_instances_ is 1, instance_ is used, else dynamic allocation with
instances_ is used. */
size_t num_instances_;
BrotliEncoderDictionary instance_;
BrotliEncoderDictionary* instances_;
} ContextualEncoderDictionary;
typedef struct SharedEncoderDictionary {
/* Magic value to distinguish this struct from PreparedDictionary for
certain external usages. */
uint32_t magic;
/* LZ77 prefix, compound dictionary */
CompoundDictionary compound;
/* Custom static dictionary (optionally context-based) */
ContextualEncoderDictionary contextual;
/* The maximum quality the dictionary was computed for */
int max_quality;
} SharedEncoderDictionary;
typedef struct ManagedDictionary {
uint32_t magic;
MemoryManager memory_manager_;
uint32_t* dictionary;
} ManagedDictionary;
/* Initializes to the brotli built-in dictionary */
BROTLI_INTERNAL void BrotliInitSharedEncoderDictionary(
SharedEncoderDictionary* dict);
#if defined(BROTLI_EXPERIMENTAL)
/* Initializes to shared dictionary that will be parsed from
encoded_dict. Requires that you keep the encoded_dict buffer
around, parts of data will point to it. */
BROTLI_INTERNAL BROTLI_BOOL BrotliInitCustomSharedEncoderDictionary(
MemoryManager* m, const uint8_t* encoded_dict, size_t size,
int quality, SharedEncoderDictionary* dict);
#endif /* BROTLI_EXPERIMENTAL */
BROTLI_INTERNAL void BrotliCleanupSharedEncoderDictionary(
MemoryManager* m, SharedEncoderDictionary* dict);
BROTLI_INTERNAL ManagedDictionary* BrotliCreateManagedDictionary(
brotli_alloc_func alloc_func, brotli_free_func free_func, void* opaque);
BROTLI_INTERNAL void BrotliDestroyManagedDictionary(
ManagedDictionary* dictionary);
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_ENCODER_DICT_H_ */

501
c/enc/entropy_encode.c Normal file
View File

@@ -0,0 +1,501 @@
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Entropy encoding (Huffman) utilities. */
#include "entropy_encode.h"
#include "../common/constants.h"
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
const BROTLI_MODEL("small") size_t kBrotliShellGaps[] = {132, 57, 23, 10, 4, 1};
BROTLI_BOOL BrotliSetDepth(
int p0, HuffmanTree* pool, uint8_t* depth, int max_depth) {
int stack[16];
int level = 0;
int p = p0;
BROTLI_DCHECK(max_depth <= 15);
stack[0] = -1;
while (BROTLI_TRUE) {
if (pool[p].index_left_ >= 0) {
level++;
if (level > max_depth) return BROTLI_FALSE;
stack[level] = pool[p].index_right_or_value_;
p = pool[p].index_left_;
continue;
} else {
depth[pool[p].index_right_or_value_] = (uint8_t)level;
}
while (level >= 0 && stack[level] == -1) level--;
if (level < 0) return BROTLI_TRUE;
p = stack[level];
stack[level] = -1;
}
}
/* Sort the root nodes, least popular first. */
static BROTLI_INLINE BROTLI_BOOL SortHuffmanTree(
const HuffmanTree* v0, const HuffmanTree* v1) {
if (v0->total_count_ != v1->total_count_) {
return TO_BROTLI_BOOL(v0->total_count_ < v1->total_count_);
}
return TO_BROTLI_BOOL(v0->index_right_or_value_ > v1->index_right_or_value_);
}
/* This function will create a Huffman tree.
The catch here is that the tree cannot be arbitrarily deep.
Brotli specifies a maximum depth of 15 bits for "code trees"
and 7 bits for "code length code trees."
count_limit is the value that is to be faked as the minimum value
and this minimum value is raised until the tree matches the
maximum length requirement.
This algorithm is not of excellent performance for very long data blocks,
especially when population counts are longer than 2**tree_limit, but
we are not planning to use this with extremely long blocks.
See http://en.wikipedia.org/wiki/Huffman_coding */
void BrotliCreateHuffmanTree(const uint32_t* data,
const size_t length,
const int tree_limit,
HuffmanTree* tree,
uint8_t* depth) {
uint32_t count_limit;
HuffmanTree sentinel;
InitHuffmanTree(&sentinel, BROTLI_UINT32_MAX, -1, -1);
/* For block sizes below 64 kB, we never need to do a second iteration
of this loop. Probably all of our block sizes will be smaller than
that, so this loop is mostly of academic interest. If we actually
would need this, we would be better off with the Katajainen algorithm. */
for (count_limit = 1; ; count_limit *= 2) {
size_t n = 0;
size_t i;
size_t j;
size_t k;
for (i = length; i != 0;) {
--i;
if (data[i]) {
const uint32_t count = BROTLI_MAX(uint32_t, data[i], count_limit);
InitHuffmanTree(&tree[n++], count, -1, (int16_t)i);
}
}
if (n == 1) {
depth[tree[0].index_right_or_value_] = 1; /* Only one element. */
break;
}
SortHuffmanTreeItems(tree, n, SortHuffmanTree);
/* The nodes are:
[0, n): the sorted leaf nodes that we start with.
[n]: we add a sentinel here.
[n + 1, 2n): new parent nodes are added here, starting from
(n+1). These are naturally in ascending order.
[2n]: we add a sentinel at the end as well.
There will be (2n+1) elements at the end. */
tree[n] = sentinel;
tree[n + 1] = sentinel;
i = 0; /* Points to the next leaf node. */
j = n + 1; /* Points to the next non-leaf node. */
for (k = n - 1; k != 0; --k) {
size_t left, right;
if (tree[i].total_count_ <= tree[j].total_count_) {
left = i;
++i;
} else {
left = j;
++j;
}
if (tree[i].total_count_ <= tree[j].total_count_) {
right = i;
++i;
} else {
right = j;
++j;
}
{
/* The sentinel node becomes the parent node. */
size_t j_end = 2 * n - k;
tree[j_end].total_count_ =
tree[left].total_count_ + tree[right].total_count_;
tree[j_end].index_left_ = (int16_t)left;
tree[j_end].index_right_or_value_ = (int16_t)right;
/* Add back the last sentinel node. */
tree[j_end + 1] = sentinel;
}
}
if (BrotliSetDepth((int)(2 * n - 1), &tree[0], depth, tree_limit)) {
/* We need to pack the Huffman tree in tree_limit bits. If this was not
successful, add fake entities to the lowest values and retry. */
break;
}
}
}
static void Reverse(uint8_t* v, size_t start, size_t end) {
--end;
while (start < end) {
uint8_t tmp = v[start];
v[start] = v[end];
v[end] = tmp;
++start;
--end;
}
}
static void BrotliWriteHuffmanTreeRepetitions(
const uint8_t previous_value,
const uint8_t value,
size_t repetitions,
size_t* tree_size,
uint8_t* tree,
uint8_t* extra_bits_data) {
BROTLI_DCHECK(repetitions > 0);
if (previous_value != value) {
tree[*tree_size] = value;
extra_bits_data[*tree_size] = 0;
++(*tree_size);
--repetitions;
}
if (repetitions == 7) {
tree[*tree_size] = value;
extra_bits_data[*tree_size] = 0;
++(*tree_size);
--repetitions;
}
if (repetitions < 3) {
size_t i;
for (i = 0; i < repetitions; ++i) {
tree[*tree_size] = value;
extra_bits_data[*tree_size] = 0;
++(*tree_size);
}
} else {
size_t start = *tree_size;
repetitions -= 3;
while (BROTLI_TRUE) {
tree[*tree_size] = BROTLI_REPEAT_PREVIOUS_CODE_LENGTH;
extra_bits_data[*tree_size] = repetitions & 0x3;
++(*tree_size);
repetitions >>= 2;
if (repetitions == 0) {
break;
}
--repetitions;
}
Reverse(tree, start, *tree_size);
Reverse(extra_bits_data, start, *tree_size);
}
}
static void BrotliWriteHuffmanTreeRepetitionsZeros(
size_t repetitions,
size_t* tree_size,
uint8_t* tree,
uint8_t* extra_bits_data) {
if (repetitions == 11) {
tree[*tree_size] = 0;
extra_bits_data[*tree_size] = 0;
++(*tree_size);
--repetitions;
}
if (repetitions < 3) {
size_t i;
for (i = 0; i < repetitions; ++i) {
tree[*tree_size] = 0;
extra_bits_data[*tree_size] = 0;
++(*tree_size);
}
} else {
size_t start = *tree_size;
repetitions -= 3;
while (BROTLI_TRUE) {
tree[*tree_size] = BROTLI_REPEAT_ZERO_CODE_LENGTH;
extra_bits_data[*tree_size] = repetitions & 0x7;
++(*tree_size);
repetitions >>= 3;
if (repetitions == 0) {
break;
}
--repetitions;
}
Reverse(tree, start, *tree_size);
Reverse(extra_bits_data, start, *tree_size);
}
}
void BrotliOptimizeHuffmanCountsForRle(size_t length, uint32_t* counts,
uint8_t* good_for_rle) {
size_t nonzero_count = 0;
size_t stride;
size_t limit;
size_t sum;
const size_t streak_limit = 1240;
/* Let's make the Huffman code more compatible with RLE encoding. */
size_t i;
for (i = 0; i < length; i++) {
if (counts[i]) {
++nonzero_count;
}
}
if (nonzero_count < 16) {
return;
}
while (length != 0 && counts[length - 1] == 0) {
--length;
}
if (length == 0) {
return; /* All zeros. */
}
/* Now counts[0..length - 1] does not have trailing zeros. */
{
size_t nonzeros = 0;
uint32_t smallest_nonzero = 1 << 30;
for (i = 0; i < length; ++i) {
if (counts[i] != 0) {
++nonzeros;
if (smallest_nonzero > counts[i]) {
smallest_nonzero = counts[i];
}
}
}
if (nonzeros < 5) {
/* Small histogram will model it well. */
return;
}
if (smallest_nonzero < 4) {
size_t zeros = length - nonzeros;
if (zeros < 6) {
for (i = 1; i < length - 1; ++i) {
if (counts[i - 1] != 0 && counts[i] == 0 && counts[i + 1] != 0) {
counts[i] = 1;
}
}
}
}
if (nonzeros < 28) {
return;
}
}
/* 2) Let's mark all population counts that already can be encoded
with an RLE code. */
memset(good_for_rle, 0, length);
{
/* Let's not spoil any of the existing good RLE codes.
Mark any seq of 0's that is longer as 5 as a good_for_rle.
Mark any seq of non-0's that is longer as 7 as a good_for_rle. */
uint32_t symbol = counts[0];
size_t step = 0;
for (i = 0; i <= length; ++i) {
if (i == length || counts[i] != symbol) {
if ((symbol == 0 && step >= 5) ||
(symbol != 0 && step >= 7)) {
size_t k;
for (k = 0; k < step; ++k) {
good_for_rle[i - k - 1] = 1;
}
}
step = 1;
if (i != length) {
symbol = counts[i];
}
} else {
++step;
}
}
}
/* 3) Let's replace those population counts that lead to more RLE codes.
Math here is in 24.8 fixed point representation. */
stride = 0;
limit = 256 * (counts[0] + counts[1] + counts[2]) / 3 + 420;
sum = 0;
for (i = 0; i <= length; ++i) {
if (i == length || good_for_rle[i] ||
(i != 0 && good_for_rle[i - 1]) ||
(256 * counts[i] - limit + streak_limit) >= 2 * streak_limit) {
if (stride >= 4 || (stride >= 3 && sum == 0)) {
size_t k;
/* The stride must end, collapse what we have, if we have enough (4). */
size_t count = (sum + stride / 2) / stride;
if (count == 0) {
count = 1;
}
if (sum == 0) {
/* Don't make an all zeros stride to be upgraded to ones. */
count = 0;
}
for (k = 0; k < stride; ++k) {
/* We don't want to change value at counts[i],
that is already belonging to the next stride. Thus - 1. */
counts[i - k - 1] = (uint32_t)count;
}
}
stride = 0;
sum = 0;
if (i < length - 2) {
/* All interesting strides have a count of at least 4, */
/* at least when non-zeros. */
limit = 256 * (counts[i] + counts[i + 1] + counts[i + 2]) / 3 + 420;
} else if (i < length) {
limit = 256 * counts[i];
} else {
limit = 0;
}
}
++stride;
if (i != length) {
sum += counts[i];
if (stride >= 4) {
limit = (256 * sum + stride / 2) / stride;
}
if (stride == 4) {
limit += 120;
}
}
}
}
static void DecideOverRleUse(const uint8_t* depth, const size_t length,
BROTLI_BOOL* use_rle_for_non_zero,
BROTLI_BOOL* use_rle_for_zero) {
size_t total_reps_zero = 0;
size_t total_reps_non_zero = 0;
size_t count_reps_zero = 1;
size_t count_reps_non_zero = 1;
size_t i;
for (i = 0; i < length;) {
const uint8_t value = depth[i];
size_t reps = 1;
size_t k;
for (k = i + 1; k < length && depth[k] == value; ++k) {
++reps;
}
if (reps >= 3 && value == 0) {
total_reps_zero += reps;
++count_reps_zero;
}
if (reps >= 4 && value != 0) {
total_reps_non_zero += reps;
++count_reps_non_zero;
}
i += reps;
}
*use_rle_for_non_zero =
TO_BROTLI_BOOL(total_reps_non_zero > count_reps_non_zero * 2);
*use_rle_for_zero = TO_BROTLI_BOOL(total_reps_zero > count_reps_zero * 2);
}
void BrotliWriteHuffmanTree(const uint8_t* depth,
size_t length,
size_t* tree_size,
uint8_t* tree,
uint8_t* extra_bits_data) {
uint8_t previous_value = BROTLI_INITIAL_REPEATED_CODE_LENGTH;
size_t i;
BROTLI_BOOL use_rle_for_non_zero = BROTLI_FALSE;
BROTLI_BOOL use_rle_for_zero = BROTLI_FALSE;
/* Throw away trailing zeros. */
size_t new_length = length;
for (i = 0; i < length; ++i) {
if (depth[length - i - 1] == 0) {
--new_length;
} else {
break;
}
}
/* First gather statistics on if it is a good idea to do RLE. */
if (length > 50) {
/* Find RLE coding for longer codes.
Shorter codes seem not to benefit from RLE. */
DecideOverRleUse(depth, new_length,
&use_rle_for_non_zero, &use_rle_for_zero);
}
/* Actual RLE coding. */
for (i = 0; i < new_length;) {
const uint8_t value = depth[i];
size_t reps = 1;
if ((value != 0 && use_rle_for_non_zero) ||
(value == 0 && use_rle_for_zero)) {
size_t k;
for (k = i + 1; k < new_length && depth[k] == value; ++k) {
++reps;
}
}
if (value == 0) {
BrotliWriteHuffmanTreeRepetitionsZeros(
reps, tree_size, tree, extra_bits_data);
} else {
BrotliWriteHuffmanTreeRepetitions(previous_value,
value, reps, tree_size,
tree, extra_bits_data);
previous_value = value;
}
i += reps;
}
}
static uint16_t BrotliReverseBits(size_t num_bits, uint16_t bits) {
static const size_t BROTLI_MODEL("small") kLut[16] =
{ /* Pre-reversed 4-bit values. */
0x00, 0x08, 0x04, 0x0C, 0x02, 0x0A, 0x06, 0x0E,
0x01, 0x09, 0x05, 0x0D, 0x03, 0x0B, 0x07, 0x0F
};
size_t retval = kLut[bits & 0x0F];
size_t i;
for (i = 4; i < num_bits; i += 4) {
retval <<= 4;
bits = (uint16_t)(bits >> 4);
retval |= kLut[bits & 0x0F];
}
retval >>= ((0 - num_bits) & 0x03);
return (uint16_t)retval;
}
/* 0..15 are values for bits */
#define MAX_HUFFMAN_BITS 16
void BrotliConvertBitDepthsToSymbols(const uint8_t* depth,
size_t len,
uint16_t* bits) {
/* In Brotli, all bit depths are [1..15]
0 bit depth means that the symbol does not exist. */
uint16_t bl_count[MAX_HUFFMAN_BITS] = { 0 };
uint16_t next_code[MAX_HUFFMAN_BITS];
size_t i;
int code = 0;
for (i = 0; i < len; ++i) {
++bl_count[depth[i]];
}
bl_count[0] = 0;
next_code[0] = 0;
for (i = 1; i < MAX_HUFFMAN_BITS; ++i) {
code = (code + bl_count[i - 1]) << 1;
next_code[i] = (uint16_t)code;
}
for (i = 0; i < len; ++i) {
if (depth[i]) {
bits[i] = BrotliReverseBits(depth[i], next_code[depth[i]]++);
}
}
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif

121
c/enc/entropy_encode.h Normal file
View File

@@ -0,0 +1,121 @@
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Entropy encoding (Huffman) utilities. */
#ifndef BROTLI_ENC_ENTROPY_ENCODE_H_
#define BROTLI_ENC_ENTROPY_ENCODE_H_
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* A node of a Huffman tree. */
typedef struct HuffmanTree {
uint32_t total_count_;
int16_t index_left_;
int16_t index_right_or_value_;
} HuffmanTree;
static BROTLI_INLINE void InitHuffmanTree(HuffmanTree* self, uint32_t count,
int16_t left, int16_t right) {
self->total_count_ = count;
self->index_left_ = left;
self->index_right_or_value_ = right;
}
/* Returns 1 is assignment of depths succeeded, otherwise 0. */
BROTLI_INTERNAL BROTLI_BOOL BrotliSetDepth(
int p, HuffmanTree* pool, uint8_t* depth, int max_depth);
/* This function will create a Huffman tree.
The (data,length) contains the population counts.
The tree_limit is the maximum bit depth of the Huffman codes.
The depth contains the tree, i.e., how many bits are used for
the symbol.
The actual Huffman tree is constructed in the tree[] array, which has to
be at least 2 * length + 1 long.
See http://en.wikipedia.org/wiki/Huffman_coding */
BROTLI_INTERNAL void BrotliCreateHuffmanTree(const uint32_t* data,
const size_t length,
const int tree_limit,
HuffmanTree* tree,
uint8_t* depth);
/* Change the population counts in a way that the consequent
Huffman tree compression, especially its RLE-part will be more
likely to compress this data more efficiently.
length contains the size of the histogram.
counts contains the population counts.
good_for_rle is a buffer of at least length size */
BROTLI_INTERNAL void BrotliOptimizeHuffmanCountsForRle(
size_t length, uint32_t* counts, uint8_t* good_for_rle);
/* Write a Huffman tree from bit depths into the bit-stream representation
of a Huffman tree. The generated Huffman tree is to be compressed once
more using a Huffman tree */
BROTLI_INTERNAL void BrotliWriteHuffmanTree(const uint8_t* depth,
size_t length,
size_t* tree_size,
uint8_t* tree,
uint8_t* extra_bits_data);
/* Get the actual bit values for a tree of bit depths. */
BROTLI_INTERNAL void BrotliConvertBitDepthsToSymbols(const uint8_t* depth,
size_t len,
uint16_t* bits);
BROTLI_INTERNAL extern BROTLI_MODEL("small") const size_t kBrotliShellGaps[6];
/* Input size optimized Shell sort. */
typedef BROTLI_BOOL (*HuffmanTreeComparator)(
const HuffmanTree*, const HuffmanTree*);
static BROTLI_INLINE void SortHuffmanTreeItems(HuffmanTree* items,
const size_t n, HuffmanTreeComparator comparator) {
if (n < 13) {
/* Insertion sort. */
size_t i;
for (i = 1; i < n; ++i) {
HuffmanTree tmp = items[i];
size_t k = i;
size_t j = i - 1;
while (comparator(&tmp, &items[j])) {
items[k] = items[j];
k = j;
if (!j--) break;
}
items[k] = tmp;
}
return;
} else {
/* Shell sort. */
int g = n < 57 ? 2 : 0;
for (; g < 6; ++g) {
size_t gap = kBrotliShellGaps[g];
size_t i;
for (i = gap; i < n; ++i) {
size_t j = i;
HuffmanTree tmp = items[i];
for (; j >= gap && comparator(&tmp, &items[j - gap]); j -= gap) {
items[j] = items[j - gap];
}
items[j] = tmp;
}
}
}
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_ENTROPY_ENCODE_H_ */

View File

@@ -4,22 +4,25 @@
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
// Static entropy codes used for faster meta-block encoding.
/* Static entropy codes used for faster meta-block encoding. */
#ifndef BROTLI_ENC_ENTROPY_ENCODE_STATIC_H_
#define BROTLI_ENC_ENTROPY_ENCODE_STATIC_H_
#include "./prefix.h"
#include "./types.h"
#include "./write_bits.h"
#include "../common/constants.h"
#include "../common/platform.h"
#include "write_bits.h"
namespace brotli {
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
static const uint8_t kCodeLengthDepth[18] = {
static const BROTLI_MODEL("small") uint8_t kCodeLengthDepth[18] = {
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 0, 4, 4,
};
static const uint8_t kStaticCommandCodeDepth[kNumCommandPrefixes] = {
static const BROTLI_MODEL("small")
uint8_t kStaticCommandCodeDepth[BROTLI_NUM_COMMAND_SYMBOLS] = {
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9,
@@ -66,22 +69,28 @@ static const uint8_t kStaticCommandCodeDepth[kNumCommandPrefixes] = {
11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11,
};
static const uint8_t kStaticDistanceCodeDepth[64] = {
static const BROTLI_MODEL("small")
uint8_t kStaticDistanceCodeDepth[64] = {
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
};
static const uint32_t kCodeLengthBits[18] = {
/* GENERATED CODE START */
static const BROTLI_MODEL("small")
uint32_t kCodeLengthBits[18] = {
0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 15, 31, 0, 11, 7,
};
inline void StoreStaticCodeLengthCode(size_t* storage_ix, uint8_t* storage) {
WriteBits(40, MAKE_UINT64_T(0xff, 0x55555554), storage_ix, storage);
static BROTLI_INLINE void StoreStaticCodeLengthCode(
size_t* storage_ix, uint8_t* storage) {
BrotliWriteBits(
40, BROTLI_MAKE_UINT64_T(0x0000FFu, 0x55555554u), storage_ix, storage);
}
static const uint64_t kZeroRepsBits[704] = {
static const BROTLI_MODEL("small")
uint64_t kZeroRepsBits[BROTLI_NUM_COMMAND_SYMBOLS] = {
0x00000000, 0x00000000, 0x00000000, 0x00000007, 0x00000017, 0x00000027,
0x00000037, 0x00000047, 0x00000057, 0x00000067, 0x00000077, 0x00000770,
0x00000b87, 0x00001387, 0x00001b87, 0x00002387, 0x00002b87, 0x00003387,
@@ -202,7 +211,8 @@ static const uint64_t kZeroRepsBits[704] = {
0x06f9cb87, 0x08f9cb87,
};
static const uint32_t kZeroRepsDepth[704] = {
static const BROTLI_MODEL("small")
uint32_t kZeroRepsDepth[BROTLI_NUM_COMMAND_SYMBOLS] = {
0, 4, 8, 7, 7, 7, 7, 7, 7, 7, 7, 11, 14, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14,
@@ -249,7 +259,8 @@ static const uint32_t kZeroRepsDepth[704] = {
28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28,
};
static const uint64_t kNonZeroRepsBits[704] = {
static const BROTLI_MODEL("small")
uint64_t kNonZeroRepsBits[BROTLI_NUM_COMMAND_SYMBOLS] = {
0x0000000b, 0x0000001b, 0x0000002b, 0x0000003b, 0x000002cb, 0x000006cb,
0x00000acb, 0x00000ecb, 0x000002db, 0x000006db, 0x00000adb, 0x00000edb,
0x000002eb, 0x000006eb, 0x00000aeb, 0x00000eeb, 0x000002fb, 0x000006fb,
@@ -370,7 +381,8 @@ static const uint64_t kNonZeroRepsBits[704] = {
0x2baeb6db, 0x3baeb6db,
};
static const uint32_t kNonZeroRepsDepth[704] = {
static const BROTLI_MODEL("small")
uint32_t kNonZeroRepsDepth[BROTLI_NUM_COMMAND_SYMBOLS] = {
6, 6, 6, 6, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18, 18,
@@ -417,47 +429,8 @@ static const uint32_t kNonZeroRepsDepth[704] = {
30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30,
};
static const uint16_t kStaticLiteralCodeBits[256] = {
0, 128, 64, 192, 32, 160, 96, 224,
16, 144, 80, 208, 48, 176, 112, 240,
8, 136, 72, 200, 40, 168, 104, 232,
24, 152, 88, 216, 56, 184, 120, 248,
4, 132, 68, 196, 36, 164, 100, 228,
20, 148, 84, 212, 52, 180, 116, 244,
12, 140, 76, 204, 44, 172, 108, 236,
28, 156, 92, 220, 60, 188, 124, 252,
2, 130, 66, 194, 34, 162, 98, 226,
18, 146, 82, 210, 50, 178, 114, 242,
10, 138, 74, 202, 42, 170, 106, 234,
26, 154, 90, 218, 58, 186, 122, 250,
6, 134, 70, 198, 38, 166, 102, 230,
22, 150, 86, 214, 54, 182, 118, 246,
14, 142, 78, 206, 46, 174, 110, 238,
30, 158, 94, 222, 62, 190, 126, 254,
1, 129, 65, 193, 33, 161, 97, 225,
17, 145, 81, 209, 49, 177, 113, 241,
9, 137, 73, 201, 41, 169, 105, 233,
25, 153, 89, 217, 57, 185, 121, 249,
5, 133, 69, 197, 37, 165, 101, 229,
21, 149, 85, 213, 53, 181, 117, 245,
13, 141, 77, 205, 45, 173, 109, 237,
29, 157, 93, 221, 61, 189, 125, 253,
3, 131, 67, 195, 35, 163, 99, 227,
19, 147, 83, 211, 51, 179, 115, 243,
11, 139, 75, 203, 43, 171, 107, 235,
27, 155, 91, 219, 59, 187, 123, 251,
7, 135, 71, 199, 39, 167, 103, 231,
23, 151, 87, 215, 55, 183, 119, 247,
15, 143, 79, 207, 47, 175, 111, 239,
31, 159, 95, 223, 63, 191, 127, 255,
};
inline void StoreStaticLiteralHuffmanTree(size_t* storage_ix,
uint8_t* storage) {
WriteBits(32, 0x00010003U, storage_ix, storage);
}
static const uint16_t kStaticCommandCodeBits[kNumCommandPrefixes] = {
static const BROTLI_MODEL("small")
uint16_t kStaticCommandCodeBits[BROTLI_NUM_COMMAND_SYMBOLS] = {
0, 256, 128, 384, 64, 320, 192, 448,
32, 288, 160, 416, 96, 352, 224, 480,
16, 272, 144, 400, 80, 336, 208, 464,
@@ -548,25 +521,28 @@ static const uint16_t kStaticCommandCodeBits[kNumCommandPrefixes] = {
255, 1279, 767, 1791, 511, 1535, 1023, 2047,
};
inline void StoreStaticCommandHuffmanTree(size_t* storage_ix,
uint8_t* storage) {
WriteBits(28, 0x0000000006307003U, storage_ix, storage);
WriteBits(31, 0x0000000009262441U, storage_ix, storage);
static BROTLI_INLINE void StoreStaticCommandHuffmanTree(
size_t* storage_ix, uint8_t* storage) {
BrotliWriteBits(
56, BROTLI_MAKE_UINT64_T(0x926244U, 0x16307003U), storage_ix, storage);
BrotliWriteBits(3, 0x00000000U, storage_ix, storage);
}
static const uint16_t kStaticDistanceCodeBits[64] = {
static const BROTLI_MODEL("small") uint16_t kStaticDistanceCodeBits[64] = {
0, 32, 16, 48, 8, 40, 24, 56, 4, 36, 20, 52, 12, 44, 28, 60,
2, 34, 18, 50, 10, 42, 26, 58, 6, 38, 22, 54, 14, 46, 30, 62,
1, 33, 17, 49, 9, 41, 25, 57, 5, 37, 21, 53, 13, 45, 29, 61,
3, 35, 19, 51, 11, 43, 27, 59, 7, 39, 23, 55, 15, 47, 31, 63,
};
inline void StoreStaticDistanceHuffmanTree(size_t* storage_ix,
uint8_t* storage) {
WriteBits(18, 0x000000000001dc03U, storage_ix, storage);
WriteBits(10, 0x00000000000000daU, storage_ix, storage);
static BROTLI_INLINE void StoreStaticDistanceHuffmanTree(
size_t* storage_ix, uint8_t* storage) {
BrotliWriteBits(28, 0x0369DC03u, storage_ix, storage);
}
/* GENERATED CODE END */
} // namespace brotli
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif // BROTLI_ENC_ENTROPY_ENCODE_STATIC_H_
#endif /* BROTLI_ENC_ENTROPY_ENCODE_STATIC_H_ */

View File

@@ -4,33 +4,14 @@
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
// Utilities for fast computation of logarithms.
#include "fast_log.h"
#ifndef BROTLI_ENC_FAST_LOG_H_
#define BROTLI_ENC_FAST_LOG_H_
#include <assert.h>
#include <math.h>
#include "./types.h"
namespace brotli {
static inline uint32_t Log2FloorNonZero(size_t n) {
#ifdef __GNUC__
return 31u ^ static_cast<uint32_t>(__builtin_clz(static_cast<uint32_t>(n)));
#else
uint32_t result = 0;
while (n >>= 1) result++;
return result;
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
}
// A lookup table for small values of log2(int) to be used in entropy
// computation.
//
// ", ".join(["%.16ff" % x for x in [0.0]+[log2(x) for x in range(1, 256)]])
static const float kLog2Table[] = {
/* ", ".join(["%.16ff" % x for x in [0.0]+[log2(x) for x in range(1, 256)]]) */
const BROTLI_MODEL("small") double kBrotliLog2Table[BROTLI_LOG2_TABLE_SIZE] = {
0.0000000000000000f, 0.0000000000000000f, 1.0000000000000000f,
1.5849625007211563f, 2.0000000000000000f, 2.3219280948873622f,
2.5849625007211561f, 2.8073549220576042f, 3.0000000000000000f,
@@ -119,21 +100,6 @@ static const float kLog2Table[] = {
7.9943534368588578f
};
// Faster logarithm for small integers, with the property of log2(0) == 0.
static inline double FastLog2(size_t v) {
if (v < sizeof(kLog2Table) / sizeof(kLog2Table[0])) {
return kLog2Table[v];
}
#if defined(_MSC_VER) && _MSC_VER <= 1700
// Visual Studio 2012 does not have the log2() function defined, so we use
// log() and a multiplication instead.
static const double kLog2Inv = 1.4426950408889634f;
return log(static_cast<double>(v)) * kLog2Inv;
#else
return log2(static_cast<double>(v));
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
}
} // namespace brotli
#endif // BROTLI_ENC_FAST_LOG_H_

66
c/enc/fast_log.h Normal file
View File

@@ -0,0 +1,66 @@
/* Copyright 2013 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Utilities for fast computation of logarithms. */
#ifndef BROTLI_ENC_FAST_LOG_H_
#define BROTLI_ENC_FAST_LOG_H_
#include <math.h>
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
static BROTLI_INLINE uint32_t Log2FloorNonZero(size_t n) {
#if defined(BROTLI_BSR32)
return BROTLI_BSR32((uint32_t)n);
#else
uint32_t result = 0;
while (n >>= 1) result++;
return result;
#endif
}
#define BROTLI_LOG2_TABLE_SIZE 256
/* A lookup table for small values of log2(int) to be used in entropy
computation. */
BROTLI_INTERNAL extern const BROTLI_MODEL("small")
double kBrotliLog2Table[BROTLI_LOG2_TABLE_SIZE];
/* Visual Studio 2012 and Android API levels < 18 do not have the log2()
* function defined, so we use log() and a multiplication instead. */
#if !defined(BROTLI_HAVE_LOG2)
#if ((defined(_MSC_VER) && _MSC_VER <= 1700) || \
(defined(__ANDROID_API__) && __ANDROID_API__ < 18))
#define BROTLI_HAVE_LOG2 0
#else
#define BROTLI_HAVE_LOG2 1
#endif
#endif
#define LOG_2_INV 1.4426950408889634
/* Faster logarithm for small integers, with the property of log2(0) == 0. */
static BROTLI_INLINE double FastLog2(size_t v) {
if (v < BROTLI_LOG2_TABLE_SIZE) {
return kBrotliLog2Table[v];
}
#if !(BROTLI_HAVE_LOG2)
return log((double)v) * LOG_2_INV;
#else
return log2((double)v);
#endif
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_FAST_LOG_H_ */

70
c/enc/find_match_length.h Normal file
View File

@@ -0,0 +1,70 @@
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Function to find maximal matching prefixes of strings. */
#ifndef BROTLI_ENC_FIND_MATCH_LENGTH_H_
#define BROTLI_ENC_FIND_MATCH_LENGTH_H_
#include "../common/platform.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
/* Separate implementation for little-endian 64-bit targets, for speed. */
#if defined(BROTLI_TZCNT64) && BROTLI_64_BITS && BROTLI_LITTLE_ENDIAN
static BROTLI_INLINE size_t FindMatchLengthWithLimit(const uint8_t* s1,
const uint8_t* s2,
size_t limit) {
const uint8_t *s1_orig = s1;
for (; limit >= 8; limit -= 8) {
uint64_t x = BROTLI_UNALIGNED_LOAD64LE(s2) ^
BROTLI_UNALIGNED_LOAD64LE(s1);
s2 += 8;
if (x != 0) {
size_t matching_bits = (size_t)BROTLI_TZCNT64(x);
return (size_t)(s1 - s1_orig) + (matching_bits >> 3);
}
s1 += 8;
}
while (limit && *s1 == *s2) {
limit--;
++s2;
++s1;
}
return (size_t)(s1 - s1_orig);
}
#else
static BROTLI_INLINE size_t FindMatchLengthWithLimit(const uint8_t* s1,
const uint8_t* s2,
size_t limit) {
size_t matched = 0;
const uint8_t* s2_limit = s2 + limit;
const uint8_t* s2_ptr = s2;
/* Find out how long the match is. We loop over the data 32 bits at a
time until we find a 32-bit block that doesn't match; then we find
the first non-matching bit and use that to calculate the total
length of the match. */
while (s2_ptr <= s2_limit - 4 &&
BrotliUnalignedRead32(s2_ptr) ==
BrotliUnalignedRead32(s1 + matched)) {
s2_ptr += 4;
matched += 4;
}
while ((s2_ptr < s2_limit) && (s1[matched] == *s2_ptr)) {
++s2_ptr;
++matched;
}
return matched;
}
#endif
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_FIND_MATCH_LENGTH_H_ */

735
c/enc/hash.h Normal file
View File

@@ -0,0 +1,735 @@
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* A (forgetful) hash table to the data seen by the compressor, to
help create backward references to previous data. */
#ifndef BROTLI_ENC_HASH_H_
#define BROTLI_ENC_HASH_H_
#include "../common/constants.h"
#include "../common/dictionary.h"
#include "../common/platform.h"
#include "compound_dictionary.h"
#include "encoder_dict.h"
#include "fast_log.h"
#include "find_match_length.h"
#include "hash_base.h"
#include "matching_tag_mask.h"
#include "memory.h"
#include "params.h"
#include "quality.h"
#include "static_dict.h"
#if defined(__cplusplus) || defined(c_plusplus)
extern "C" {
#endif
typedef struct {
/**
* Dynamically allocated areas; regular hasher uses one or two allocations;
* "composite" hasher uses up to 4 allocations.
*/
void* extra[4];
/**
* False before the first invocation of HasherSetup (where "extra" memory)
* is allocated.
*/
BROTLI_BOOL is_setup_;
size_t dict_num_lookups;
size_t dict_num_matches;
BrotliHasherParams params;
/**
* False if hasher needs to be "prepared" before use (before the first
* invocation of HasherSetup or after HasherReset). "preparation" is hasher
* data initialization (using input ringbuffer).
*/
BROTLI_BOOL is_prepared_;
} HasherCommon;
#define score_t size_t
static const uint32_t kCutoffTransformsCount = 10;
/* 0, 12, 27, 23, 42, 63, 56, 48, 59, 64 */
/* 0+0, 4+8, 8+19, 12+11, 16+26, 20+43, 24+32, 28+20, 32+27, 36+28 */
static const uint64_t kCutoffTransforms =
BROTLI_MAKE_UINT64_T(0x071B520A, 0xDA2D3200);
typedef struct HasherSearchResult {
size_t len;
size_t distance;
score_t score;
int len_code_delta; /* == len_code - len */
} HasherSearchResult;
static BROTLI_INLINE void PrepareDistanceCache(
int* BROTLI_RESTRICT distance_cache, const int num_distances) {
if (num_distances > 4) {
int last_distance = distance_cache[0];
distance_cache[4] = last_distance - 1;
distance_cache[5] = last_distance + 1;
distance_cache[6] = last_distance - 2;
distance_cache[7] = last_distance + 2;
distance_cache[8] = last_distance - 3;
distance_cache[9] = last_distance + 3;
if (num_distances > 10) {
int next_last_distance = distance_cache[1];
distance_cache[10] = next_last_distance - 1;
distance_cache[11] = next_last_distance + 1;
distance_cache[12] = next_last_distance - 2;
distance_cache[13] = next_last_distance + 2;
distance_cache[14] = next_last_distance - 3;
distance_cache[15] = next_last_distance + 3;
}
}
}
#define BROTLI_LITERAL_BYTE_SCORE 135
#define BROTLI_DISTANCE_BIT_PENALTY 30
/* Score must be positive after applying maximal penalty. */
#define BROTLI_SCORE_BASE (BROTLI_DISTANCE_BIT_PENALTY * 8 * sizeof(size_t))
/* Usually, we always choose the longest backward reference. This function
allows for the exception of that rule.
If we choose a backward reference that is further away, it will
usually be coded with more bits. We approximate this by assuming
log2(distance). If the distance can be expressed in terms of the
last four distances, we use some heuristic constants to estimate
the bits cost. For the first up to four literals we use the bit
cost of the literals from the literal cost model, after that we
use the average bit cost of the cost model.
This function is used to sometimes discard a longer backward reference
when it is not much longer and the bit cost for encoding it is more
than the saved literals.
backward_reference_offset MUST be positive. */
static BROTLI_INLINE score_t BackwardReferenceScore(
size_t copy_length, size_t backward_reference_offset) {
return BROTLI_SCORE_BASE + BROTLI_LITERAL_BYTE_SCORE * (score_t)copy_length -
BROTLI_DISTANCE_BIT_PENALTY * Log2FloorNonZero(backward_reference_offset);
}
static BROTLI_INLINE score_t BackwardReferenceScoreUsingLastDistance(
size_t copy_length) {
return BROTLI_LITERAL_BYTE_SCORE * (score_t)copy_length +
BROTLI_SCORE_BASE + 15;
}
static BROTLI_INLINE score_t BackwardReferencePenaltyUsingLastDistance(
size_t distance_short_code) {
return (score_t)39 + ((0x1CA10 >> (distance_short_code & 0xE)) & 0xE);
}
static BROTLI_INLINE BROTLI_BOOL TestStaticDictionaryItem(
const BrotliEncoderDictionary* dictionary, size_t len, size_t word_idx,
const uint8_t* data, size_t max_length, size_t max_backward,
size_t max_distance, HasherSearchResult* out) {
size_t offset;
size_t matchlen;
size_t backward;
score_t score;
offset = dictionary->words->offsets_by_length[len] + len * word_idx;
if (len > max_length) {
return BROTLI_FALSE;
}
matchlen =
FindMatchLengthWithLimit(data, &dictionary->words->data[offset], len);
if (matchlen + dictionary->cutoffTransformsCount <= len || matchlen == 0) {
return BROTLI_FALSE;
}
{
size_t cut = len - matchlen;
size_t transform_id = (cut << 2) +
(size_t)((dictionary->cutoffTransforms >> (cut * 6)) & 0x3F);
backward = max_backward + 1 + word_idx +
(transform_id << dictionary->words->size_bits_by_length[len]);
}
if (backward > max_distance) {
return BROTLI_FALSE;
}
score = BackwardReferenceScore(matchlen, backward);
if (score < out->score) {
return BROTLI_FALSE;
}
out->len = matchlen;
out->len_code_delta = (int)len - (int)matchlen;
out->distance = backward;
out->score = score;
return BROTLI_TRUE;
}
static BROTLI_INLINE void SearchInStaticDictionary(
const BrotliEncoderDictionary* dictionary,
HasherCommon* common, const uint8_t* data, size_t max_length,
size_t max_backward, size_t max_distance,
HasherSearchResult* out, BROTLI_BOOL shallow) {
size_t key;
size_t i;
if (common->dict_num_matches < (common->dict_num_lookups >> 7)) {
return;
}
key = Hash14(data) << 1;
for (i = 0; i < (shallow ? 1u : 2u); ++i, ++key) {
common->dict_num_lookups++;
if (dictionary->hash_table_lengths[key] != 0) {
BROTLI_BOOL item_matches = TestStaticDictionaryItem(
dictionary, dictionary->hash_table_lengths[key],
dictionary->hash_table_words[key], data,
max_length, max_backward, max_distance, out);
if (item_matches) {
common->dict_num_matches++;
}
}
}
}
typedef struct BackwardMatch {
uint32_t distance;
uint32_t length_and_code;
} BackwardMatch;
static BROTLI_INLINE void InitBackwardMatch(BackwardMatch* self,
size_t dist, size_t len) {
self->distance = (uint32_t)dist;
self->length_and_code = (uint32_t)(len << 5);
}
static BROTLI_INLINE void InitDictionaryBackwardMatch(BackwardMatch* self,
size_t dist, size_t len, size_t len_code) {
self->distance = (uint32_t)dist;
self->length_and_code =
(uint32_t)((len << 5) | (len == len_code ? 0 : len_code));
}
static BROTLI_INLINE size_t BackwardMatchLength(const BackwardMatch* self) {
return self->length_and_code >> 5;
}
static BROTLI_INLINE size_t BackwardMatchLengthCode(const BackwardMatch* self) {
size_t code = self->length_and_code & 31;
return code ? code : BackwardMatchLength(self);
}
#define EXPAND_CAT(a, b) CAT(a, b)
#define CAT(a, b) a ## b
#define FN(X) EXPAND_CAT(X, HASHER())
#define HASHER() H10
#define BUCKET_BITS 17
#define MAX_TREE_SEARCH_DEPTH 64
#define MAX_TREE_COMP_LENGTH 128
#include "hash_to_binary_tree_inc.h" /* NOLINT(build/include) */
#undef MAX_TREE_SEARCH_DEPTH
#undef MAX_TREE_COMP_LENGTH
#undef BUCKET_BITS
#undef HASHER
/* MAX_NUM_MATCHES == 64 + MAX_TREE_SEARCH_DEPTH */
#define MAX_NUM_MATCHES_H10 128
/* For BUCKET_SWEEP_BITS == 0, enabling the dictionary lookup makes compression
a little faster (0.5% - 1%) and it compresses 0.15% better on small text
and HTML inputs. */
#define HASHER() H2
#define BUCKET_BITS 16
#define BUCKET_SWEEP_BITS 0
#define HASH_LEN 5
#define USE_DICTIONARY 1
#include "hash_longest_match_quickly_inc.h" /* NOLINT(build/include) */
#undef BUCKET_SWEEP_BITS
#undef USE_DICTIONARY
#undef HASHER
#define HASHER() H3
#define BUCKET_SWEEP_BITS 1
#define USE_DICTIONARY 0
#include "hash_longest_match_quickly_inc.h" /* NOLINT(build/include) */
#undef USE_DICTIONARY
#undef BUCKET_SWEEP_BITS
#undef BUCKET_BITS
#undef HASHER
#define HASHER() H4
#define BUCKET_BITS 17
#define BUCKET_SWEEP_BITS 2
#define USE_DICTIONARY 1
#include "hash_longest_match_quickly_inc.h" /* NOLINT(build/include) */
#undef USE_DICTIONARY
#undef HASH_LEN
#undef BUCKET_SWEEP_BITS
#undef BUCKET_BITS
#undef HASHER
#define HASHER() H5
#include "hash_longest_match_inc.h" /* NOLINT(build/include) */
#undef HASHER
#define HASHER() H6
#include "hash_longest_match64_inc.h" /* NOLINT(build/include) */
#undef HASHER
#if defined(BROTLI_MAX_SIMD_QUALITY)
#define HASHER() H58
#include "hash_longest_match_simd_inc.h" /* NOLINT(build/include) */
#undef HASHER
#define HASHER() H68
#include "hash_longest_match64_simd_inc.h" /* NOLINT(build/include) */
#undef HASHER
#endif
#define BUCKET_BITS 15
#define NUM_LAST_DISTANCES_TO_CHECK 4
#define NUM_BANKS 1
#define BANK_BITS 16
#define HASHER() H40
#include "hash_forgetful_chain_inc.h" /* NOLINT(build/include) */
#undef HASHER
#undef NUM_LAST_DISTANCES_TO_CHECK
#define NUM_LAST_DISTANCES_TO_CHECK 10
#define HASHER() H41
#include "hash_forgetful_chain_inc.h" /* NOLINT(build/include) */
#undef HASHER
#undef NUM_LAST_DISTANCES_TO_CHECK
#undef NUM_BANKS
#undef BANK_BITS
#define NUM_LAST_DISTANCES_TO_CHECK 16
#define NUM_BANKS 512
#define BANK_BITS 9
#define HASHER() H42
#include "hash_forgetful_chain_inc.h" /* NOLINT(build/include) */
#undef HASHER
#undef NUM_LAST_DISTANCES_TO_CHECK
#undef NUM_BANKS
#undef BANK_BITS
#undef BUCKET_BITS
#define HASHER() H54
#define BUCKET_BITS 20
#define BUCKET_SWEEP_BITS 2
#define HASH_LEN 7
#define USE_DICTIONARY 0
#include "hash_longest_match_quickly_inc.h" /* NOLINT(build/include) */
#undef USE_DICTIONARY
#undef HASH_LEN
#undef BUCKET_SWEEP_BITS
#undef BUCKET_BITS
#undef HASHER
/* fast large window hashers */
#define HASHER() HROLLING_FAST
#define CHUNKLEN 32
#define JUMP 4
#define NUMBUCKETS 16777216
#define MASK ((NUMBUCKETS * 64) - 1)
#include "hash_rolling_inc.h" /* NOLINT(build/include) */
#undef JUMP
#undef HASHER
#define HASHER() HROLLING
#define JUMP 1
#include "hash_rolling_inc.h" /* NOLINT(build/include) */
#undef MASK
#undef NUMBUCKETS
#undef JUMP
#undef CHUNKLEN
#undef HASHER
#define HASHER() H35
#define HASHER_A H3
#define HASHER_B HROLLING_FAST
#include "hash_composite_inc.h" /* NOLINT(build/include) */
#undef HASHER_A
#undef HASHER_B
#undef HASHER
#define HASHER() H55
#define HASHER_A H54
#define HASHER_B HROLLING_FAST
#include "hash_composite_inc.h" /* NOLINT(build/include) */
#undef HASHER_A
#undef HASHER_B
#undef HASHER
#define HASHER() H65
#define HASHER_A H6
#define HASHER_B HROLLING
#include "hash_composite_inc.h" /* NOLINT(build/include) */
#undef HASHER_A
#undef HASHER_B
#undef HASHER
#undef FN
#undef CAT
#undef EXPAND_CAT
#if defined(BROTLI_MAX_SIMD_QUALITY)
#define FOR_SIMPLE_HASHERS(H) \
H(2) H(3) H(4) H(5) H(6) H(40) H(41) H(42) H(54) H(58) H(68)
#else
#define FOR_SIMPLE_HASHERS(H) \
H(2) H(3) H(4) H(5) H(6) H(40) H(41) H(42) H(54)
#endif
#define FOR_COMPOSITE_HASHERS(H) H(35) H(55) H(65)
#define FOR_GENERIC_HASHERS(H) FOR_SIMPLE_HASHERS(H) FOR_COMPOSITE_HASHERS(H)
#define FOR_ALL_HASHERS(H) FOR_GENERIC_HASHERS(H) H(10)
typedef struct {
HasherCommon common;
union {
#define MEMBER_(N) \
H ## N _H ## N;
FOR_ALL_HASHERS(MEMBER_)
#undef MEMBER_
} privat;
} Hasher;
/* MUST be invoked before any other method. */
static BROTLI_INLINE void HasherInit(Hasher* hasher) {
hasher->common.is_setup_ = BROTLI_FALSE;
hasher->common.extra[0] = NULL;
hasher->common.extra[1] = NULL;
hasher->common.extra[2] = NULL;
hasher->common.extra[3] = NULL;
}
static BROTLI_INLINE void DestroyHasher(MemoryManager* m, Hasher* hasher) {
if (hasher->common.extra[0] != NULL) BROTLI_FREE(m, hasher->common.extra[0]);
if (hasher->common.extra[1] != NULL) BROTLI_FREE(m, hasher->common.extra[1]);
if (hasher->common.extra[2] != NULL) BROTLI_FREE(m, hasher->common.extra[2]);
if (hasher->common.extra[3] != NULL) BROTLI_FREE(m, hasher->common.extra[3]);
}
static BROTLI_INLINE void HasherReset(Hasher* hasher) {
hasher->common.is_prepared_ = BROTLI_FALSE;
}
static BROTLI_INLINE void HasherSize(const BrotliEncoderParams* params,
BROTLI_BOOL one_shot, const size_t input_size, size_t* alloc_size) {
switch (params->hasher.type) {
#define SIZE_(N) \
case N: \
HashMemAllocInBytesH ## N(params, one_shot, input_size, alloc_size); \
break;
FOR_ALL_HASHERS(SIZE_)
#undef SIZE_
default:
break;
}
}
static BROTLI_INLINE void HasherSetup(MemoryManager* m, Hasher* hasher,
BrotliEncoderParams* params, const uint8_t* data, size_t position,
size_t input_size, BROTLI_BOOL is_last) {
BROTLI_BOOL one_shot = (position == 0 && is_last);
if (!hasher->common.is_setup_) {
size_t alloc_size[4] = {0};
size_t i;
ChooseHasher(params, &params->hasher);
hasher->common.params = params->hasher;
hasher->common.dict_num_lookups = 0;
hasher->common.dict_num_matches = 0;
HasherSize(params, one_shot, input_size, alloc_size);
for (i = 0; i < 4; ++i) {
if (alloc_size[i] == 0) continue;
hasher->common.extra[i] = BROTLI_ALLOC(m, uint8_t, alloc_size[i]);
if (BROTLI_IS_OOM(m) || BROTLI_IS_NULL(hasher->common.extra[i])) return;
}
switch (hasher->common.params.type) {
#define INITIALIZE_(N) \
case N: \
InitializeH ## N(&hasher->common, \
&hasher->privat._H ## N, params); \
break;
FOR_ALL_HASHERS(INITIALIZE_);
#undef INITIALIZE_
default:
break;
}
HasherReset(hasher);
hasher->common.is_setup_ = BROTLI_TRUE;
}
if (!hasher->common.is_prepared_) {
switch (hasher->common.params.type) {
#define PREPARE_(N) \
case N: \
PrepareH ## N( \
&hasher->privat._H ## N, \
one_shot, input_size, data); \
break;
FOR_ALL_HASHERS(PREPARE_)
#undef PREPARE_
default: break;
}
hasher->common.is_prepared_ = BROTLI_TRUE;
}
}
static BROTLI_INLINE void InitOrStitchToPreviousBlock(
MemoryManager* m, Hasher* hasher, const uint8_t* data, size_t mask,
BrotliEncoderParams* params, size_t position, size_t input_size,
BROTLI_BOOL is_last) {
HasherSetup(m, hasher, params, data, position, input_size, is_last);
if (BROTLI_IS_OOM(m)) return;
switch (hasher->common.params.type) {
#define INIT_(N) \
case N: \
StitchToPreviousBlockH ## N( \
&hasher->privat._H ## N, \
input_size, position, data, mask); \
break;
FOR_ALL_HASHERS(INIT_)
#undef INIT_
default: break;
}
}
/* NB: when seamless dictionary-ring-buffer copies are implemented, don't forget
to add proper guards for non-zero-BROTLI_PARAM_STREAM_OFFSET. */
static BROTLI_INLINE void FindCompoundDictionaryMatch(
const PreparedDictionary* self, const uint8_t* BROTLI_RESTRICT data,
const size_t ring_buffer_mask, const int* BROTLI_RESTRICT distance_cache,
const size_t cur_ix, const size_t max_length, const size_t distance_offset,
const size_t max_distance, HasherSearchResult* BROTLI_RESTRICT out) {
const uint32_t source_size = self->source_size;
const size_t boundary = distance_offset - source_size;
const uint32_t hash_bits = self->hash_bits;
const uint32_t bucket_bits = self->bucket_bits;
const uint32_t slot_bits = self->slot_bits;
const uint32_t hash_shift = 64u - bucket_bits;
const uint32_t slot_mask = (~((uint32_t)0U)) >> (32 - slot_bits);
const uint64_t hash_mask = (~((uint64_t)0U)) >> (64 - hash_bits);
const uint32_t* slot_offsets = (uint32_t*)(&self[1]);
const uint16_t* heads = (uint16_t*)(&slot_offsets[(size_t)1u << slot_bits]);
const uint32_t* items = (uint32_t*)(&heads[(size_t)1u << bucket_bits]);
const uint8_t* source = NULL;
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
score_t best_score = out->score;
size_t best_len = out->len;
size_t i;
const uint64_t h =
(BROTLI_UNALIGNED_LOAD64LE(&data[cur_ix_masked]) & hash_mask) *
kPreparedDictionaryHashMul64Long;
const uint32_t key = (uint32_t)(h >> hash_shift);
const uint32_t slot = key & slot_mask;
const uint32_t head = heads[key];
const uint32_t* BROTLI_RESTRICT chain = &items[slot_offsets[slot] + head];
uint32_t item = (head == 0xFFFF) ? 1 : 0;
const void* tail = (void*)&items[self->num_items];
if (self->magic == kPreparedDictionaryMagic) {
source = (const uint8_t*)tail;
} else {
/* kLeanPreparedDictionaryMagic */
source = (const uint8_t*)BROTLI_UNALIGNED_LOAD_PTR((const uint8_t**)tail);
}
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
for (i = 0; i < 4; ++i) {
const size_t distance = (size_t)distance_cache[i];
size_t offset;
size_t limit;
size_t len;
if (distance <= boundary || distance > distance_offset) continue;
offset = distance_offset - distance;
limit = source_size - offset;
limit = limit > max_length ? max_length : limit;
len = FindMatchLengthWithLimit(&source[offset], &data[cur_ix_masked],
limit);
if (len >= 2) {
score_t score = BackwardReferenceScoreUsingLastDistance(len);
if (best_score < score) {
if (i != 0) score -= BackwardReferencePenaltyUsingLastDistance(i);
if (best_score < score) {
best_score = score;
if (len > best_len) best_len = len;
out->len = len;
out->len_code_delta = 0;
out->distance = distance;
out->score = best_score;
}
}
}
}
/* we require matches of len >4, so increase best_len to 3, so we can compare
* 4 bytes all the time. */
if (best_len < 3) {
best_len = 3;
}
while (item == 0) {
size_t offset;
size_t distance;
size_t limit;
item = *chain;
chain++;
offset = item & 0x7FFFFFFF;
item &= 0x80000000;
distance = distance_offset - offset;
limit = source_size - offset;
limit = (limit > max_length) ? max_length : limit;
if (distance > max_distance) continue;
if (cur_ix_masked + best_len > ring_buffer_mask || best_len >= limit ||
/* compare 4 bytes ending at best_len + 1 */
BrotliUnalignedRead32(&data[cur_ix_masked + best_len - 3]) !=
BrotliUnalignedRead32(&source[offset + best_len - 3])) {
continue;
}
{
const size_t len = FindMatchLengthWithLimit(&source[offset],
&data[cur_ix_masked],
limit);
if (len >= 4) {
score_t score = BackwardReferenceScore(len, distance);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->len_code_delta = 0;
out->distance = distance;
out->score = best_score;
}
}
}
}
}
/* NB: when seamless dictionary-ring-buffer copies are implemented, don't forget
to add proper guards for non-zero-BROTLI_PARAM_STREAM_OFFSET. */
static BROTLI_INLINE size_t FindAllCompoundDictionaryMatches(
const PreparedDictionary* self, const uint8_t* BROTLI_RESTRICT data,
const size_t ring_buffer_mask, const size_t cur_ix, const size_t min_length,
const size_t max_length, const size_t distance_offset,
const size_t max_distance, BackwardMatch* matches, size_t match_limit) {
const uint32_t source_size = self->source_size;
const uint32_t hash_bits = self->hash_bits;
const uint32_t bucket_bits = self->bucket_bits;
const uint32_t slot_bits = self->slot_bits;
const uint32_t hash_shift = 64u - bucket_bits;
const uint32_t slot_mask = (~((uint32_t)0U)) >> (32 - slot_bits);
const uint64_t hash_mask = (~((uint64_t)0U)) >> (64 - hash_bits);
const uint32_t* slot_offsets = (uint32_t*)(&self[1]);
const uint16_t* heads = (uint16_t*)(&slot_offsets[(size_t)1u << slot_bits]);
const uint32_t* items = (uint32_t*)(&heads[(size_t)1u << bucket_bits]);
const uint8_t* source = NULL;
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
size_t best_len = min_length;
const uint64_t h =
(BROTLI_UNALIGNED_LOAD64LE(&data[cur_ix_masked]) & hash_mask) *
kPreparedDictionaryHashMul64Long;
const uint32_t key = (uint32_t)(h >> hash_shift);
const uint32_t slot = key & slot_mask;
const uint32_t head = heads[key];
const uint32_t* BROTLI_RESTRICT chain = &items[slot_offsets[slot] + head];
uint32_t item = (head == 0xFFFF) ? 1 : 0;
size_t found = 0;
const void* tail = (void*)&items[self->num_items];
if (self->magic == kPreparedDictionaryMagic) {
source = (const uint8_t*)tail;
} else {
/* kLeanPreparedDictionaryMagic */
source = (const uint8_t*)BROTLI_UNALIGNED_LOAD_PTR((const uint8_t**)tail);
}
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
while (item == 0) {
size_t offset;
size_t distance;
size_t limit;
size_t len;
item = *chain;
chain++;
offset = item & 0x7FFFFFFF;
item &= 0x80000000;
distance = distance_offset - offset;
limit = source_size - offset;
limit = (limit > max_length) ? max_length : limit;
if (distance > max_distance) continue;
if (cur_ix_masked + best_len > ring_buffer_mask ||
best_len >= limit ||
data[cur_ix_masked + best_len] != source[offset + best_len]) {
continue;
}
len = FindMatchLengthWithLimit(
&source[offset], &data[cur_ix_masked], limit);
if (len > best_len) {
best_len = len;
InitBackwardMatch(matches++, distance, len);
found++;
if (found == match_limit) break;
}
}
return found;
}
static BROTLI_INLINE void LookupCompoundDictionaryMatch(
const CompoundDictionary* addon, const uint8_t* BROTLI_RESTRICT data,
const size_t ring_buffer_mask, const int* BROTLI_RESTRICT distance_cache,
const size_t cur_ix, const size_t max_length,
const size_t max_ring_buffer_distance, const size_t max_distance,
HasherSearchResult* sr) {
size_t base_offset = max_ring_buffer_distance + 1 + addon->total_size - 1;
size_t d;
for (d = 0; d < addon->num_chunks; ++d) {
/* Only one prepared dictionary type is currently supported. */
FindCompoundDictionaryMatch(
(const PreparedDictionary*)addon->chunks[d], data, ring_buffer_mask,
distance_cache, cur_ix, max_length,
base_offset - addon->chunk_offsets[d], max_distance, sr);
}
}
static BROTLI_INLINE size_t LookupAllCompoundDictionaryMatches(
const CompoundDictionary* addon, const uint8_t* BROTLI_RESTRICT data,
const size_t ring_buffer_mask, const size_t cur_ix, size_t min_length,
const size_t max_length, const size_t max_ring_buffer_distance,
const size_t max_distance, BackwardMatch* matches,
size_t match_limit) {
size_t base_offset = max_ring_buffer_distance + 1 + addon->total_size - 1;
size_t d;
size_t total_found = 0;
for (d = 0; d < addon->num_chunks; ++d) {
/* Only one prepared dictionary type is currently supported. */
total_found += FindAllCompoundDictionaryMatches(
(const PreparedDictionary*)addon->chunks[d], data, ring_buffer_mask,
cur_ix, min_length, max_length, base_offset - addon->chunk_offsets[d],
max_distance, matches + total_found, match_limit - total_found);
if (total_found == match_limit) break;
if (total_found > 0) {
min_length = BackwardMatchLength(&matches[total_found - 1]);
}
}
return total_found;
}
#if defined(__cplusplus) || defined(c_plusplus)
} /* extern "C" */
#endif
#endif /* BROTLI_ENC_HASH_H_ */

38
c/enc/hash_base.h Normal file
View File

@@ -0,0 +1,38 @@
/* Copyright 2025 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* Basic common hash functions / constants. */
#ifndef THIRD_PARTY_BROTLI_ENC_HASH_BASE_H_
#define THIRD_PARTY_BROTLI_ENC_HASH_BASE_H_
#include "../common/platform.h"
/* kHashMul32 multiplier has these properties:
* The multiplier must be odd. Otherwise we may lose the highest bit.
* No long streaks of ones or zeros.
* There is no effort to ensure that it is a prime, the oddity is enough
for this use.
* The number has been tuned heuristically against compression benchmarks. */
static const uint32_t kHashMul32 = 0x1E35A7BD;
static const uint64_t kHashMul64 =
BROTLI_MAKE_UINT64_T(0x1FE35A7Bu, 0xD3579BD3u);
static BROTLI_INLINE uint32_t Hash14(const uint8_t* data) {
uint32_t h = BROTLI_UNALIGNED_LOAD32LE(data) * kHashMul32;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return h >> (32 - 14);
}
static BROTLI_INLINE uint32_t Hash15(const uint8_t* data) {
uint32_t h = BROTLI_UNALIGNED_LOAD32LE(data) * kHashMul32;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return h >> (32 - 15);
}
#endif // THIRD_PARTY_BROTLI_ENC_HASH_BASE_H_

140
c/enc/hash_composite_inc.h Normal file
View File

@@ -0,0 +1,140 @@
/* NOLINT(build/header_guard) */
/* Copyright 2018 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN, HASHER_A, HASHER_B */
/* Composite hasher: This hasher allows to combine two other hashers, HASHER_A
and HASHER_B. */
#define HashComposite HASHER()
#define FN_A(X) EXPAND_CAT(X, HASHER_A)
#define FN_B(X) EXPAND_CAT(X, HASHER_B)
static BROTLI_INLINE size_t FN(HashTypeLength)(void) {
size_t a = FN_A(HashTypeLength)();
size_t b = FN_B(HashTypeLength)();
return a > b ? a : b;
}
static BROTLI_INLINE size_t FN(StoreLookahead)(void) {
size_t a = FN_A(StoreLookahead)();
size_t b = FN_B(StoreLookahead)();
return a > b ? a : b;
}
typedef struct HashComposite {
HASHER_A ha;
HASHER_B hb;
HasherCommon ha_common;
HasherCommon hb_common;
/* Shortcuts. */
HasherCommon* common;
BROTLI_BOOL fresh;
const BrotliEncoderParams* params;
} HashComposite;
static void FN(Initialize)(HasherCommon* common,
HashComposite* BROTLI_RESTRICT self, const BrotliEncoderParams* params) {
self->common = common;
self->ha_common = *self->common;
self->hb_common = *self->common;
self->fresh = BROTLI_TRUE;
self->params = params;
/* TODO(lode): Initialize of the hashers is deferred to Prepare (and params
remembered here) because we don't get the one_shot and input_size params
here that are needed to know the memory size of them. Instead provide
those params to all hashers FN(Initialize) */
}
static void FN(Prepare)(
HashComposite* BROTLI_RESTRICT self, BROTLI_BOOL one_shot,
size_t input_size, const uint8_t* BROTLI_RESTRICT data) {
if (self->fresh) {
self->fresh = BROTLI_FALSE;
self->ha_common.extra[0] = self->common->extra[0];
self->ha_common.extra[1] = self->common->extra[1];
self->ha_common.extra[2] = NULL;
self->ha_common.extra[3] = NULL;
self->hb_common.extra[0] = self->common->extra[2];
self->hb_common.extra[1] = self->common->extra[3];
self->hb_common.extra[2] = NULL;
self->hb_common.extra[3] = NULL;
FN_A(Initialize)(&self->ha_common, &self->ha, self->params);
FN_B(Initialize)(&self->hb_common, &self->hb, self->params);
}
FN_A(Prepare)(&self->ha, one_shot, input_size, data);
FN_B(Prepare)(&self->hb, one_shot, input_size, data);
}
static BROTLI_INLINE void FN(HashMemAllocInBytes)(
const BrotliEncoderParams* params, BROTLI_BOOL one_shot,
size_t input_size, size_t* alloc_size) {
size_t alloc_size_a[4] = {0};
size_t alloc_size_b[4] = {0};
FN_A(HashMemAllocInBytes)(params, one_shot, input_size, alloc_size_a);
FN_B(HashMemAllocInBytes)(params, one_shot, input_size, alloc_size_b);
/* Should never happen. */
if (alloc_size_a[2] != 0 || alloc_size_a[3] != 0) exit(EXIT_FAILURE);
if (alloc_size_b[2] != 0 || alloc_size_b[3] != 0) exit(EXIT_FAILURE);
alloc_size[0] = alloc_size_a[0];
alloc_size[1] = alloc_size_a[1];
alloc_size[2] = alloc_size_b[0];
alloc_size[3] = alloc_size_b[1];
}
static BROTLI_INLINE void FN(Store)(HashComposite* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask, const size_t ix) {
FN_A(Store)(&self->ha, data, mask, ix);
FN_B(Store)(&self->hb, data, mask, ix);
}
static BROTLI_INLINE void FN(StoreRange)(
HashComposite* BROTLI_RESTRICT self, const uint8_t* BROTLI_RESTRICT data,
const size_t mask, const size_t ix_start,
const size_t ix_end) {
FN_A(StoreRange)(&self->ha, data, mask, ix_start, ix_end);
FN_B(StoreRange)(&self->hb, data, mask, ix_start, ix_end);
}
static BROTLI_INLINE void FN(StitchToPreviousBlock)(
HashComposite* BROTLI_RESTRICT self,
size_t num_bytes, size_t position, const uint8_t* ringbuffer,
size_t ring_buffer_mask) {
FN_A(StitchToPreviousBlock)(&self->ha, num_bytes, position,
ringbuffer, ring_buffer_mask);
FN_B(StitchToPreviousBlock)(&self->hb, num_bytes, position,
ringbuffer, ring_buffer_mask);
}
static BROTLI_INLINE void FN(PrepareDistanceCache)(
HashComposite* BROTLI_RESTRICT self, int* BROTLI_RESTRICT distance_cache) {
FN_A(PrepareDistanceCache)(&self->ha, distance_cache);
FN_B(PrepareDistanceCache)(&self->hb, distance_cache);
}
static BROTLI_INLINE void FN(FindLongestMatch)(
HashComposite* BROTLI_RESTRICT self,
const BrotliEncoderDictionary* dictionary,
const uint8_t* BROTLI_RESTRICT data, const size_t ring_buffer_mask,
const int* BROTLI_RESTRICT distance_cache, const size_t cur_ix,
const size_t max_length, const size_t max_backward,
const size_t dictionary_distance, const size_t max_distance,
HasherSearchResult* BROTLI_RESTRICT out) {
FN_A(FindLongestMatch)(&self->ha, dictionary, data, ring_buffer_mask,
distance_cache, cur_ix, max_length, max_backward, dictionary_distance,
max_distance, out);
FN_B(FindLongestMatch)(&self->hb, dictionary, data, ring_buffer_mask,
distance_cache, cur_ix, max_length, max_backward, dictionary_distance,
max_distance, out);
}
#undef HashComposite

View File

@@ -0,0 +1,305 @@
/* NOLINT(build/header_guard) */
/* Copyright 2016 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN, BUCKET_BITS, NUM_BANKS, BANK_BITS,
NUM_LAST_DISTANCES_TO_CHECK */
/* A (forgetful) hash table to the data seen by the compressor, to
help create backward references to previous data.
Hashes are stored in chains which are bucketed to groups. Group of chains
share a storage "bank". When more than "bank size" chain nodes are added,
oldest nodes are replaced; this way several chains may share a tail. */
#define HashForgetfulChain HASHER()
#define BANK_SIZE (1 << BANK_BITS)
/* Number of hash buckets. */
#define BUCKET_SIZE (1 << BUCKET_BITS)
#define CAPPED_CHAINS 0
static BROTLI_INLINE size_t FN(HashTypeLength)(void) { return 4; }
static BROTLI_INLINE size_t FN(StoreLookahead)(void) { return 4; }
/* HashBytes is the function that chooses the bucket to place the address in.*/
static BROTLI_INLINE size_t FN(HashBytes)(const uint8_t* BROTLI_RESTRICT data) {
const uint32_t h = BROTLI_UNALIGNED_LOAD32LE(data) * kHashMul32;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return h >> (32 - BUCKET_BITS);
}
typedef struct FN(Slot) {
uint16_t delta;
uint16_t next;
} FN(Slot);
typedef struct FN(Bank) {
FN(Slot) slots[BANK_SIZE];
} FN(Bank);
typedef struct HashForgetfulChain {
uint16_t free_slot_idx[NUM_BANKS]; /* Up to 1KiB. Move to dynamic? */
size_t max_hops;
/* Shortcuts. */
void* extra[2];
HasherCommon* common;
/* --- Dynamic size members --- */
/* uint32_t addr[BUCKET_SIZE]; */
/* uint16_t head[BUCKET_SIZE]; */
/* Truncated hash used for quick rejection of "distance cache" candidates. */
/* uint8_t tiny_hash[65536];*/
/* FN(Bank) banks[NUM_BANKS]; */
} HashForgetfulChain;
static uint32_t* FN(Addr)(void* extra) {
return (uint32_t*)extra;
}
static uint16_t* FN(Head)(void* extra) {
return (uint16_t*)(&FN(Addr)(extra)[BUCKET_SIZE]);
}
static uint8_t* FN(TinyHash)(void* extra) {
return (uint8_t*)(&FN(Head)(extra)[BUCKET_SIZE]);
}
static FN(Bank)* FN(Banks)(void* extra) {
return (FN(Bank)*)(extra);
}
static void FN(Initialize)(
HasherCommon* common, HashForgetfulChain* BROTLI_RESTRICT self,
const BrotliEncoderParams* params) {
self->common = common;
self->extra[0] = common->extra[0];
self->extra[1] = common->extra[1];
self->max_hops = (params->quality > 6 ? 7u : 8u) << (params->quality - 4);
}
static void FN(Prepare)(
HashForgetfulChain* BROTLI_RESTRICT self, BROTLI_BOOL one_shot,
size_t input_size, const uint8_t* BROTLI_RESTRICT data) {
uint32_t* BROTLI_RESTRICT addr = FN(Addr)(self->extra[0]);
uint16_t* BROTLI_RESTRICT head = FN(Head)(self->extra[0]);
uint8_t* BROTLI_RESTRICT tiny_hash = FN(TinyHash)(self->extra[0]);
/* Partial preparation is 100 times slower (per socket). */
size_t partial_prepare_threshold = BUCKET_SIZE >> 6;
if (one_shot && input_size <= partial_prepare_threshold) {
size_t i;
for (i = 0; i < input_size; ++i) {
size_t bucket = FN(HashBytes)(&data[i]);
/* See InitEmpty comment. */
addr[bucket] = 0xCCCCCCCC;
head[bucket] = 0xCCCC;
}
} else {
/* Fill |addr| array with 0xCCCCCCCC value. Because of wrapping, position
processed by hasher never reaches 3GB + 64M; this makes all new chains
to be terminated after the first node. */
memset(addr, 0xCC, sizeof(uint32_t) * BUCKET_SIZE);
memset(head, 0, sizeof(uint16_t) * BUCKET_SIZE);
}
memset(tiny_hash, 0, sizeof(uint8_t) * 65536);
memset(self->free_slot_idx, 0, sizeof(self->free_slot_idx));
}
static BROTLI_INLINE void FN(HashMemAllocInBytes)(
const BrotliEncoderParams* params, BROTLI_BOOL one_shot,
size_t input_size, size_t* alloc_size) {
BROTLI_UNUSED(params);
BROTLI_UNUSED(one_shot);
BROTLI_UNUSED(input_size);
alloc_size[0] = sizeof(uint32_t) * BUCKET_SIZE +
sizeof(uint16_t) * BUCKET_SIZE + sizeof(uint8_t) * 65536;
alloc_size[1] = sizeof(FN(Bank)) * NUM_BANKS;
}
/* Look at 4 bytes at &data[ix & mask]. Compute a hash from these, and prepend
node to corresponding chain; also update tiny_hash for current position. */
static BROTLI_INLINE void FN(Store)(HashForgetfulChain* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask, const size_t ix) {
uint32_t* BROTLI_RESTRICT addr = FN(Addr)(self->extra[0]);
uint16_t* BROTLI_RESTRICT head = FN(Head)(self->extra[0]);
uint8_t* BROTLI_RESTRICT tiny_hash = FN(TinyHash)(self->extra[0]);
FN(Bank)* BROTLI_RESTRICT banks = FN(Banks)(self->extra[1]);
const size_t key = FN(HashBytes)(&data[ix & mask]);
const size_t bank = key & (NUM_BANKS - 1);
const size_t idx = self->free_slot_idx[bank]++ & (BANK_SIZE - 1);
size_t delta = ix - addr[key];
tiny_hash[(uint16_t)ix] = (uint8_t)key;
if (delta > 0xFFFF) delta = CAPPED_CHAINS ? 0 : 0xFFFF;
banks[bank].slots[idx].delta = (uint16_t)delta;
banks[bank].slots[idx].next = head[key];
addr[key] = (uint32_t)ix;
head[key] = (uint16_t)idx;
}
static BROTLI_INLINE void FN(StoreRange)(
HashForgetfulChain* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask,
const size_t ix_start, const size_t ix_end) {
size_t i;
for (i = ix_start; i < ix_end; ++i) {
FN(Store)(self, data, mask, i);
}
}
static BROTLI_INLINE void FN(StitchToPreviousBlock)(
HashForgetfulChain* BROTLI_RESTRICT self,
size_t num_bytes, size_t position, const uint8_t* ringbuffer,
size_t ring_buffer_mask) {
if (num_bytes >= FN(HashTypeLength)() - 1 && position >= 3) {
/* Prepare the hashes for three last bytes of the last write.
These could not be calculated before, since they require knowledge
of both the previous and the current block. */
FN(Store)(self, ringbuffer, ring_buffer_mask, position - 3);
FN(Store)(self, ringbuffer, ring_buffer_mask, position - 2);
FN(Store)(self, ringbuffer, ring_buffer_mask, position - 1);
}
}
static BROTLI_INLINE void FN(PrepareDistanceCache)(
HashForgetfulChain* BROTLI_RESTRICT self,
int* BROTLI_RESTRICT distance_cache) {
BROTLI_UNUSED(self);
PrepareDistanceCache(distance_cache, NUM_LAST_DISTANCES_TO_CHECK);
}
/* Find a longest backward match of &data[cur_ix] up to the length of
max_length and stores the position cur_ix in the hash table.
REQUIRES: FN(PrepareDistanceCache) must be invoked for current distance cache
values; if this method is invoked repeatedly with the same distance
cache values, it is enough to invoke FN(PrepareDistanceCache) once.
Does not look for matches longer than max_length.
Does not look for matches further away than max_backward.
Writes the best match into |out|.
|out|->score is updated only if a better match is found. */
static BROTLI_INLINE void FN(FindLongestMatch)(
HashForgetfulChain* BROTLI_RESTRICT self,
const BrotliEncoderDictionary* dictionary,
const uint8_t* BROTLI_RESTRICT data, const size_t ring_buffer_mask,
const int* BROTLI_RESTRICT distance_cache,
const size_t cur_ix, const size_t max_length, const size_t max_backward,
const size_t dictionary_distance, const size_t max_distance,
HasherSearchResult* BROTLI_RESTRICT out) {
uint32_t* BROTLI_RESTRICT addr = FN(Addr)(self->extra[0]);
uint16_t* BROTLI_RESTRICT head = FN(Head)(self->extra[0]);
uint8_t* BROTLI_RESTRICT tiny_hashes = FN(TinyHash)(self->extra[0]);
FN(Bank)* BROTLI_RESTRICT banks = FN(Banks)(self->extra[1]);
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
/* Don't accept a short copy from far away. */
score_t min_score = out->score;
score_t best_score = out->score;
size_t best_len = out->len;
size_t i;
const size_t key = FN(HashBytes)(&data[cur_ix_masked]);
const uint8_t tiny_hash = (uint8_t)(key);
out->len = 0;
out->len_code_delta = 0;
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
/* Try last distance first. */
for (i = 0; i < NUM_LAST_DISTANCES_TO_CHECK; ++i) {
const size_t backward = (size_t)distance_cache[i];
size_t prev_ix = (cur_ix - backward);
/* For distance code 0 we want to consider 2-byte matches. */
if (i > 0 && tiny_hashes[(uint16_t)prev_ix] != tiny_hash) continue;
if (prev_ix >= cur_ix || backward > max_backward) {
continue;
}
prev_ix &= ring_buffer_mask;
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 2) {
score_t score = BackwardReferenceScoreUsingLastDistance(len);
if (best_score < score) {
if (i != 0) score -= BackwardReferencePenaltyUsingLastDistance(i);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
}
}
/* we require matches of len >4, so increase best_len to 3, so we can compare
* 4 bytes all the time. */
if (best_len < 3) {
best_len = 3;
}
{
const size_t bank = key & (NUM_BANKS - 1);
size_t backward = 0;
size_t hops = self->max_hops;
size_t delta = cur_ix - addr[key];
size_t slot = head[key];
while (hops--) {
size_t prev_ix;
size_t last = slot;
backward += delta;
if (backward > max_backward || (CAPPED_CHAINS && !delta)) break;
prev_ix = (cur_ix - backward) & ring_buffer_mask;
slot = banks[bank].slots[last].next;
delta = banks[bank].slots[last].delta;
if (cur_ix_masked + best_len > ring_buffer_mask ||
prev_ix + best_len > ring_buffer_mask ||
/* compare 4 bytes ending at best_len + 1 */
BrotliUnalignedRead32(&data[cur_ix_masked + best_len - 3]) !=
BrotliUnalignedRead32(&data[prev_ix + best_len - 3])) {
continue;
}
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 4) {
/* Comparing for >= 3 does not change the semantics, but just saves
for a few unnecessary binary logarithms in backward reference
score, since we are not interested in such short matches. */
score_t score = BackwardReferenceScore(len, backward);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
}
FN(Store)(self, data, ring_buffer_mask, cur_ix);
}
if (out->score == min_score) {
SearchInStaticDictionary(dictionary,
self->common, &data[cur_ix_masked], max_length, dictionary_distance,
max_distance, out, BROTLI_FALSE);
}
}
#undef BANK_SIZE
#undef BUCKET_SIZE
#undef CAPPED_CHAINS
#undef HashForgetfulChain

View File

@@ -0,0 +1,279 @@
/* NOLINT(build/header_guard) */
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN */
/* A (forgetful) hash table to the data seen by the compressor, to
help create backward references to previous data.
This is a hash map of fixed size (bucket_size_) to a ring buffer of
fixed size (block_size_). The ring buffer contains the last block_size_
index positions of the given hash key in the compressed data. */
#define HashLongestMatch HASHER()
static BROTLI_INLINE size_t FN(HashTypeLength)(void) { return 8; }
static BROTLI_INLINE size_t FN(StoreLookahead)(void) { return 8; }
/* HashBytes is the function that chooses the bucket to place the address in. */
static BROTLI_INLINE size_t FN(HashBytes)(const uint8_t* BROTLI_RESTRICT data,
uint64_t hash_mul) {
const uint64_t h = BROTLI_UNALIGNED_LOAD64LE(data) * hash_mul;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return (size_t)(h >> (64 - 15));
}
typedef struct HashLongestMatch {
/* Number of hash buckets. */
size_t bucket_size_;
/* Only block_size_ newest backward references are kept,
and the older are forgotten. */
size_t block_size_;
/* Hash multiplier tuned to match length. */
uint64_t hash_mul_;
/* Mask for accessing entries in a block (in a ring-buffer manner). */
uint32_t block_mask_;
int block_bits_;
int num_last_distances_to_check_;
/* Shortcuts. */
HasherCommon* common_;
/* --- Dynamic size members --- */
/* Number of entries in a particular bucket. */
uint16_t* num_; /* uint16_t[bucket_size]; */
/* Buckets containing block_size_ of backward references. */
uint32_t* buckets_; /* uint32_t[bucket_size * block_size]; */
} HashLongestMatch;
static void FN(Initialize)(
HasherCommon* common, HashLongestMatch* BROTLI_RESTRICT self,
const BrotliEncoderParams* params) {
self->common_ = common;
BROTLI_UNUSED(params);
self->hash_mul_ = kHashMul64 << (64 - 5 * 8);
BROTLI_DCHECK(common->params.bucket_bits == 15);
self->bucket_size_ = (size_t)1 << common->params.bucket_bits;
self->block_bits_ = common->params.block_bits;
self->block_size_ = (size_t)1 << common->params.block_bits;
self->block_mask_ = (uint32_t)(self->block_size_ - 1);
self->num_last_distances_to_check_ =
common->params.num_last_distances_to_check;
self->num_ = (uint16_t*)common->extra[0];
self->buckets_ = (uint32_t*)common->extra[1];
}
static void FN(Prepare)(
HashLongestMatch* BROTLI_RESTRICT self, BROTLI_BOOL one_shot,
size_t input_size, const uint8_t* BROTLI_RESTRICT data) {
uint16_t* BROTLI_RESTRICT num = self->num_;
/* Partial preparation is 100 times slower (per socket). */
size_t partial_prepare_threshold = self->bucket_size_ >> 6;
if (one_shot && input_size <= partial_prepare_threshold) {
size_t i;
for (i = 0; i < input_size; ++i) {
const size_t key = FN(HashBytes)(&data[i], self->hash_mul_);
num[key] = 0;
}
} else {
memset(num, 0, self->bucket_size_ * sizeof(num[0]));
}
}
static BROTLI_INLINE void FN(HashMemAllocInBytes)(
const BrotliEncoderParams* params, BROTLI_BOOL one_shot,
size_t input_size, size_t* alloc_size) {
size_t bucket_size = (size_t)1 << params->hasher.bucket_bits;
size_t block_size = (size_t)1 << params->hasher.block_bits;
BROTLI_UNUSED(one_shot);
BROTLI_UNUSED(input_size);
alloc_size[0] = sizeof(uint16_t) * bucket_size;
alloc_size[1] = sizeof(uint32_t) * bucket_size * block_size;
}
/* Look at 4 bytes at &data[ix & mask].
Compute a hash from these, and store the value of ix at that position. */
static BROTLI_INLINE void FN(Store)(
HashLongestMatch* BROTLI_RESTRICT self, const uint8_t* BROTLI_RESTRICT data,
const size_t mask, const size_t ix) {
uint16_t* BROTLI_RESTRICT num = self->num_;
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
const size_t key = FN(HashBytes)(&data[ix & mask], self->hash_mul_);
const size_t minor_ix = num[key] & self->block_mask_;
const size_t offset = minor_ix + (key << self->block_bits_);
++num[key];
buckets[offset] = (uint32_t)ix;
}
static BROTLI_INLINE void FN(StoreRange)(HashLongestMatch* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask,
const size_t ix_start, const size_t ix_end) {
size_t i;
for (i = ix_start; i < ix_end; ++i) {
FN(Store)(self, data, mask, i);
}
}
static BROTLI_INLINE void FN(StitchToPreviousBlock)(
HashLongestMatch* BROTLI_RESTRICT self,
size_t num_bytes, size_t position, const uint8_t* ringbuffer,
size_t ringbuffer_mask) {
if (num_bytes >= FN(HashTypeLength)() - 1 && position >= 3) {
/* Prepare the hashes for three last bytes of the last write.
These could not be calculated before, since they require knowledge
of both the previous and the current block. */
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 3);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 2);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 1);
}
}
static BROTLI_INLINE void FN(PrepareDistanceCache)(
HashLongestMatch* BROTLI_RESTRICT self,
int* BROTLI_RESTRICT distance_cache) {
PrepareDistanceCache(distance_cache, self->num_last_distances_to_check_);
}
/* Find a longest backward match of &data[cur_ix] up to the length of
max_length and stores the position cur_ix in the hash table.
REQUIRES: FN(PrepareDistanceCache) must be invoked for current distance cache
values; if this method is invoked repeatedly with the same distance
cache values, it is enough to invoke FN(PrepareDistanceCache) once.
Does not look for matches longer than max_length.
Does not look for matches further away than max_backward.
Writes the best match into |out|.
|out|->score is updated only if a better match is found. */
static BROTLI_INLINE void FN(FindLongestMatch)(
HashLongestMatch* BROTLI_RESTRICT self,
const BrotliEncoderDictionary* dictionary,
const uint8_t* BROTLI_RESTRICT data, const size_t ring_buffer_mask,
const int* BROTLI_RESTRICT distance_cache, const size_t cur_ix,
const size_t max_length, const size_t max_backward,
const size_t dictionary_distance, const size_t max_distance,
HasherSearchResult* BROTLI_RESTRICT out) {
uint16_t* BROTLI_RESTRICT num = self->num_;
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
/* Don't accept a short copy from far away. */
score_t min_score = out->score;
score_t best_score = out->score;
size_t best_len = out->len;
size_t i;
/* Precalculate the hash key and prefetch the bucket. */
const size_t key = FN(HashBytes)(&data[cur_ix_masked], self->hash_mul_);
uint32_t* BROTLI_RESTRICT bucket = &buckets[key << self->block_bits_];
PREFETCH_L1(bucket);
if (self->block_bits_ > 4) PREFETCH_L1(bucket + 16);
out->len = 0;
out->len_code_delta = 0;
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
/* Try last distance first. */
for (i = 0; i < (size_t)self->num_last_distances_to_check_; ++i) {
const size_t backward = (size_t)distance_cache[i];
size_t prev_ix = (size_t)(cur_ix - backward);
if (prev_ix >= cur_ix) {
continue;
}
if (BROTLI_PREDICT_FALSE(backward > max_backward)) {
continue;
}
prev_ix &= ring_buffer_mask;
if (cur_ix_masked + best_len > ring_buffer_mask) {
break;
}
if (prev_ix + best_len > ring_buffer_mask ||
data[cur_ix_masked + best_len] != data[prev_ix + best_len]) {
continue;
}
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 3 || (len == 2 && i < 2)) {
/* Comparing for >= 2 does not change the semantics, but just saves for
a few unnecessary binary logarithms in backward reference score,
since we are not interested in such short matches. */
score_t score = BackwardReferenceScoreUsingLastDistance(len);
if (best_score < score) {
if (i != 0) score -= BackwardReferencePenaltyUsingLastDistance(i);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
}
}
/* we require matches of len >4, so increase best_len to 3, so we can compare
* 4 bytes all the time. */
if (best_len < 3) {
best_len = 3;
}
{
const size_t down =
(num[key] > self->block_size_) ?
(num[key] - self->block_size_) : 0u;
const uint32_t first4 = BrotliUnalignedRead32(data + cur_ix_masked);
const size_t max_length_m4 = max_length - 4;
i = num[key];
for (; i > down;) {
size_t prev_ix = bucket[--i & self->block_mask_];
uint32_t current4;
const size_t backward = cur_ix - prev_ix;
if (BROTLI_PREDICT_FALSE(backward > max_backward)) {
break;
}
prev_ix &= ring_buffer_mask;
if (cur_ix_masked + best_len > ring_buffer_mask) {
break;
}
if (prev_ix + best_len > ring_buffer_mask ||
/* compare 4 bytes ending at best_len + 1 */
BrotliUnalignedRead32(&data[cur_ix_masked + best_len - 3]) !=
BrotliUnalignedRead32(&data[prev_ix + best_len - 3])) {
continue;
}
current4 = BrotliUnalignedRead32(data + prev_ix);
if (first4 != current4) continue;
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix + 4],
&data[cur_ix_masked + 4],
max_length_m4) + 4;
const score_t score = BackwardReferenceScore(len, backward);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
bucket[num[key] & self->block_mask_] = (uint32_t)cur_ix;
++num[key];
}
if (min_score == out->score) {
SearchInStaticDictionary(dictionary,
self->common_, &data[cur_ix_masked], max_length, dictionary_distance,
max_distance, out, BROTLI_FALSE);
}
}
#undef HashLongestMatch

View File

@@ -0,0 +1,304 @@
/* NOLINT(build/header_guard) */
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN */
/* A (forgetful) hash table to the data seen by the compressor, to
help create backward references to previous data.
This is a hash map of fixed size (bucket_size_) to a ring buffer of
fixed size (block_size_). The ring buffer contains the last block_size_
index positions of the given hash key in the compressed data. */
#define HashLongestMatch HASHER()
#define TAG_HASH_BITS 8
#define TAG_HASH_MASK ((1 << TAG_HASH_BITS) - 1)
static BROTLI_INLINE size_t FN(HashTypeLength)(void) { return 8; }
static BROTLI_INLINE size_t FN(StoreLookahead)(void) { return 8; }
/* HashBytes is the function that chooses the bucket to place the address in. */
static BROTLI_INLINE size_t FN(HashBytes)(const uint8_t* BROTLI_RESTRICT data,
uint64_t hash_mul) {
const uint64_t h = BROTLI_UNALIGNED_LOAD64LE(data) * hash_mul;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return (size_t)(h >> (64 - 15 - TAG_HASH_BITS));
}
typedef struct HashLongestMatch {
/* Number of hash buckets. */
size_t bucket_size_;
/* Only block_size_ newest backward references are kept,
and the older are forgotten. */
size_t block_size_;
/* Hash multiplier tuned to match length. */
uint64_t hash_mul_;
/* Mask for accessing entries in a block (in a ring-buffer manner). */
uint32_t block_mask_;
int block_bits_;
int num_last_distances_to_check_;
/* Shortcuts. */
HasherCommon* common_;
/* --- Dynamic size members --- */
/* Number of entries in a particular bucket. */
uint16_t* num_; /* uint16_t[bucket_size]; */
uint8_t* tags_;
/* Buckets containing block_size_ of backward references. */
uint32_t* buckets_; /* uint32_t[bucket_size * block_size]; */
} HashLongestMatch;
static void FN(Initialize)(
HasherCommon* common, HashLongestMatch* BROTLI_RESTRICT self,
const BrotliEncoderParams* params) {
self->common_ = common;
BROTLI_UNUSED(params);
self->hash_mul_ = kHashMul64 << (64 - 5 * 8);
BROTLI_DCHECK(common->params.bucket_bits == 15);
self->bucket_size_ = (size_t)1 << common->params.bucket_bits;
self->block_bits_ = common->params.block_bits;
self->block_size_ = (size_t)1 << common->params.block_bits;
self->block_mask_ = (uint32_t)(self->block_size_ - 1);
self->num_last_distances_to_check_ =
common->params.num_last_distances_to_check;
self->num_ = (uint16_t*)common->extra[0];
self->tags_ = (uint8_t*)common->extra[1];
self->buckets_ = (uint32_t*)common->extra[2];
}
static void FN(Prepare)(
HashLongestMatch* BROTLI_RESTRICT self, BROTLI_BOOL one_shot,
size_t input_size, const uint8_t* BROTLI_RESTRICT data) {
uint16_t* BROTLI_RESTRICT num = self->num_;
/* Partial preparation is 100 times slower (per socket). */
size_t partial_prepare_threshold = self->bucket_size_ >> 6;
if (one_shot && input_size <= partial_prepare_threshold) {
size_t i;
for (i = 0; i < input_size; ++i) {
const size_t hash = FN(HashBytes)(&data[i], self->hash_mul_);
const size_t key = hash >> TAG_HASH_BITS;
num[key] = 65535;
}
} else {
/* Set all the bytes of num to 255, which makes each uint16_t 65535. */
memset(num, 255, self->bucket_size_ * sizeof(num[0]));
}
}
static BROTLI_INLINE void FN(HashMemAllocInBytes)(
const BrotliEncoderParams* params, BROTLI_BOOL one_shot,
size_t input_size, size_t* alloc_size) {
size_t bucket_size = (size_t)1 << params->hasher.bucket_bits;
size_t block_size = (size_t)1 << params->hasher.block_bits;
BROTLI_UNUSED(one_shot);
BROTLI_UNUSED(input_size);
alloc_size[0] = sizeof(uint16_t) * bucket_size;
alloc_size[1] = sizeof(uint8_t) * bucket_size * block_size;
alloc_size[2] = sizeof(uint32_t) * bucket_size * block_size;
}
/* Look at 4 bytes at &data[ix & mask].
Compute a hash from these, and store the value of ix at that position. */
static BROTLI_INLINE void FN(Store)(
HashLongestMatch* BROTLI_RESTRICT self, const uint8_t* BROTLI_RESTRICT data,
const size_t mask, const size_t ix) {
uint16_t* BROTLI_RESTRICT num = self->num_;
uint8_t* BROTLI_RESTRICT tags = self->tags_;
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
const size_t hash = FN(HashBytes)(&data[ix & mask], self->hash_mul_);
const size_t key = hash >> TAG_HASH_BITS;
const uint8_t tag = hash & TAG_HASH_MASK;
const size_t minor_ix = num[key] & self->block_mask_;
const size_t offset = minor_ix + (key << self->block_bits_);
--num[key];
buckets[offset] = (uint32_t)ix;
tags[offset] = tag;
}
static BROTLI_INLINE void FN(StoreRange)(HashLongestMatch* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask,
const size_t ix_start, const size_t ix_end) {
size_t i;
for (i = ix_start; i < ix_end; ++i) {
FN(Store)(self, data, mask, i);
}
}
static BROTLI_INLINE void FN(StitchToPreviousBlock)(
HashLongestMatch* BROTLI_RESTRICT self,
size_t num_bytes, size_t position, const uint8_t* ringbuffer,
size_t ringbuffer_mask) {
if (num_bytes >= FN(HashTypeLength)() - 1 && position >= 3) {
/* Prepare the hashes for three last bytes of the last write.
These could not be calculated before, since they require knowledge
of both the previous and the current block. */
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 3);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 2);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 1);
}
}
static BROTLI_INLINE void FN(PrepareDistanceCache)(
HashLongestMatch* BROTLI_RESTRICT self,
int* BROTLI_RESTRICT distance_cache) {
PrepareDistanceCache(distance_cache, self->num_last_distances_to_check_);
}
/* Find a longest backward match of &data[cur_ix] up to the length of
max_length and stores the position cur_ix in the hash table.
REQUIRES: FN(PrepareDistanceCache) must be invoked for current distance cache
values; if this method is invoked repeatedly with the same distance
cache values, it is enough to invoke FN(PrepareDistanceCache) once.
Does not look for matches longer than max_length.
Does not look for matches further away than max_backward.
Writes the best match into |out|.
|out|->score is updated only if a better match is found. */
static BROTLI_INLINE void FN(FindLongestMatch)(
HashLongestMatch* BROTLI_RESTRICT self,
const BrotliEncoderDictionary* dictionary,
const uint8_t* BROTLI_RESTRICT data, const size_t ring_buffer_mask,
const int* BROTLI_RESTRICT distance_cache, const size_t cur_ix,
const size_t max_length, const size_t max_backward,
const size_t dictionary_distance, const size_t max_distance,
HasherSearchResult* BROTLI_RESTRICT out) {
uint16_t* BROTLI_RESTRICT num = self->num_;
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
uint8_t* BROTLI_RESTRICT tags = self->tags_;
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
/* Don't accept a short copy from far away. */
score_t min_score = out->score;
score_t best_score = out->score;
size_t best_len = out->len;
size_t i;
/* Precalculate the hash key and prefetch the bucket. */
const size_t hash = FN(HashBytes)(&data[cur_ix_masked], self->hash_mul_);
const size_t key = hash >> TAG_HASH_BITS;
uint32_t* BROTLI_RESTRICT bucket = &buckets[key << self->block_bits_];
uint8_t* BROTLI_RESTRICT tag_bucket = &tags[key << self->block_bits_];
PREFETCH_L1(bucket);
PREFETCH_L1(tag_bucket);
if (self->block_bits_ > 4) PREFETCH_L1(bucket + 16);
out->len = 0;
out->len_code_delta = 0;
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
/* Try last distance first. */
for (i = 0; i < (size_t)self->num_last_distances_to_check_; ++i) {
const size_t backward = (size_t)distance_cache[i];
size_t prev_ix = (size_t)(cur_ix - backward);
if (prev_ix >= cur_ix) {
continue;
}
if (BROTLI_PREDICT_FALSE(backward > max_backward)) {
continue;
}
prev_ix &= ring_buffer_mask;
if (cur_ix_masked + best_len > ring_buffer_mask) {
break;
}
if (prev_ix + best_len > ring_buffer_mask ||
data[cur_ix_masked + best_len] != data[prev_ix + best_len]) {
continue;
}
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 3 || (len == 2 && i < 2)) {
/* Comparing for >= 2 does not change the semantics, but just saves for
a few unnecessary binary logarithms in backward reference score,
since we are not interested in such short matches. */
score_t score = BackwardReferenceScoreUsingLastDistance(len);
if (best_score < score) {
if (i != 0) score -= BackwardReferencePenaltyUsingLastDistance(i);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
}
}
/* we require matches of len >4, so increase best_len to 3, so we can compare
* 4 bytes all the time. */
if (best_len < 3) {
best_len = 3;
}
{
const uint8_t tag = hash & TAG_HASH_MASK;
const uint32_t first4 = BrotliUnalignedRead32(data + cur_ix_masked);
const size_t max_length_m4 = max_length - 4;
const size_t head = (num[key] + 1) & self->block_mask_;
uint64_t matches =
GetMatchingTagMask(self->block_size_ / 16, tag, tag_bucket, head);
/* Mask off any matches from uninitialized tags. */
uint16_t n = 65535 - num[key];
uint64_t block_has_unused_slots = self->block_size_ > n;
uint64_t mask = (block_has_unused_slots << (n & (64 - 1))) - 1;
matches &= mask;
for (; matches > 0; matches &= (matches - 1)) {
const size_t rb_index =
(head + (size_t)BROTLI_TZCNT64(matches)) & self->block_mask_;
size_t prev_ix = bucket[rb_index];
uint32_t current4;
const size_t backward = cur_ix - prev_ix;
if (BROTLI_PREDICT_FALSE(backward > max_backward)) {
break;
}
prev_ix &= ring_buffer_mask;
if (cur_ix_masked + best_len > ring_buffer_mask) {
break;
}
if (prev_ix + best_len > ring_buffer_mask ||
/* compare 4 bytes ending at best_len + 1 */
BrotliUnalignedRead32(&data[cur_ix_masked + best_len - 3]) !=
BrotliUnalignedRead32(&data[prev_ix + best_len - 3])) {
continue;
}
current4 = BrotliUnalignedRead32(data + prev_ix);
if (first4 != current4) continue;
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix + 4],
&data[cur_ix_masked + 4],
max_length_m4) + 4;
const score_t score = BackwardReferenceScore(len, backward);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
bucket[num[key] & self->block_mask_] = (uint32_t)cur_ix;
tag_bucket[num[key] & self->block_mask_] = tag;
--num[key];
}
if (min_score == out->score) {
SearchInStaticDictionary(dictionary,
self->common_, &data[cur_ix_masked], max_length, dictionary_distance,
max_distance, out, BROTLI_FALSE);
}
}
#undef HashLongestMatch

View File

@@ -0,0 +1,277 @@
/* NOLINT(build/header_guard) */
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN */
/* A (forgetful) hash table to the data seen by the compressor, to
help create backward references to previous data.
This is a hash map of fixed size (bucket_size_) to a ring buffer of
fixed size (block_size_). The ring buffer contains the last block_size_
index positions of the given hash key in the compressed data. */
#define HashLongestMatch HASHER()
static BROTLI_INLINE size_t FN(HashTypeLength)(void) { return 4; }
static BROTLI_INLINE size_t FN(StoreLookahead)(void) { return 4; }
/* HashBytes is the function that chooses the bucket to place the address in. */
static uint32_t FN(HashBytes)(
const uint8_t* BROTLI_RESTRICT data, const int shift) {
uint32_t h = BROTLI_UNALIGNED_LOAD32LE(data) * kHashMul32;
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return (uint32_t)(h >> shift);
}
typedef struct HashLongestMatch {
/* Number of hash buckets. */
size_t bucket_size_;
/* Only block_size_ newest backward references are kept,
and the older are forgotten. */
size_t block_size_;
/* Left-shift for computing hash bucket index from hash value. */
int hash_shift_;
/* Mask for accessing entries in a block (in a ring-buffer manner). */
uint32_t block_mask_;
int block_bits_;
int num_last_distances_to_check_;
/* Shortcuts. */
HasherCommon* common_;
/* --- Dynamic size members --- */
/* Number of entries in a particular bucket. */
uint16_t* num_; /* uint16_t[bucket_size]; */
/* Buckets containing block_size_ of backward references. */
uint32_t* buckets_; /* uint32_t[bucket_size * block_size]; */
} HashLongestMatch;
static void FN(Initialize)(
HasherCommon* common, HashLongestMatch* BROTLI_RESTRICT self,
const BrotliEncoderParams* params) {
self->common_ = common;
BROTLI_UNUSED(params);
self->hash_shift_ = 32 - common->params.bucket_bits;
self->bucket_size_ = (size_t)1 << common->params.bucket_bits;
self->block_size_ = (size_t)1 << common->params.block_bits;
self->block_mask_ = (uint32_t)(self->block_size_ - 1);
self->num_ = (uint16_t*)common->extra[0];
self->buckets_ = (uint32_t*)common->extra[1];
self->block_bits_ = common->params.block_bits;
self->num_last_distances_to_check_ =
common->params.num_last_distances_to_check;
}
static void FN(Prepare)(
HashLongestMatch* BROTLI_RESTRICT self, BROTLI_BOOL one_shot,
size_t input_size, const uint8_t* BROTLI_RESTRICT data) {
uint16_t* BROTLI_RESTRICT num = self->num_;
/* Partial preparation is 100 times slower (per socket). */
size_t partial_prepare_threshold = self->bucket_size_ >> 6;
if (one_shot && input_size <= partial_prepare_threshold) {
size_t i;
for (i = 0; i < input_size; ++i) {
const uint32_t key = FN(HashBytes)(&data[i], self->hash_shift_);
num[key] = 0;
}
} else {
memset(num, 0, self->bucket_size_ * sizeof(num[0]));
}
}
static BROTLI_INLINE void FN(HashMemAllocInBytes)(
const BrotliEncoderParams* params, BROTLI_BOOL one_shot,
size_t input_size, size_t* alloc_size) {
size_t bucket_size = (size_t)1 << params->hasher.bucket_bits;
size_t block_size = (size_t)1 << params->hasher.block_bits;
BROTLI_UNUSED(one_shot);
BROTLI_UNUSED(input_size);
alloc_size[0] = sizeof(uint16_t) * bucket_size;
alloc_size[1] = sizeof(uint32_t) * bucket_size * block_size;
}
/* Look at 4 bytes at &data[ix & mask].
Compute a hash from these, and store the value of ix at that position. */
static BROTLI_INLINE void FN(Store)(
HashLongestMatch* BROTLI_RESTRICT self, const uint8_t* BROTLI_RESTRICT data,
const size_t mask, const size_t ix) {
uint16_t* BROTLI_RESTRICT num = self->num_;
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
const uint32_t key = FN(HashBytes)(&data[ix & mask], self->hash_shift_);
const size_t minor_ix = num[key] & self->block_mask_;
const size_t offset = minor_ix + (key << self->block_bits_);
++num[key];
buckets[offset] = (uint32_t)ix;
}
static BROTLI_INLINE void FN(StoreRange)(HashLongestMatch* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask,
const size_t ix_start, const size_t ix_end) {
size_t i;
for (i = ix_start; i < ix_end; ++i) {
FN(Store)(self, data, mask, i);
}
}
static BROTLI_INLINE void FN(StitchToPreviousBlock)(
HashLongestMatch* BROTLI_RESTRICT self,
size_t num_bytes, size_t position, const uint8_t* ringbuffer,
size_t ringbuffer_mask) {
if (num_bytes >= FN(HashTypeLength)() - 1 && position >= 3) {
/* Prepare the hashes for three last bytes of the last write.
These could not be calculated before, since they require knowledge
of both the previous and the current block. */
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 3);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 2);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 1);
}
}
static BROTLI_INLINE void FN(PrepareDistanceCache)(
HashLongestMatch* BROTLI_RESTRICT self,
int* BROTLI_RESTRICT distance_cache) {
PrepareDistanceCache(distance_cache, self->num_last_distances_to_check_);
}
/* Find a longest backward match of &data[cur_ix] up to the length of
max_length and stores the position cur_ix in the hash table.
REQUIRES: FN(PrepareDistanceCache) must be invoked for current distance cache
values; if this method is invoked repeatedly with the same distance
cache values, it is enough to invoke FN(PrepareDistanceCache) once.
Does not look for matches longer than max_length.
Does not look for matches further away than max_backward.
Writes the best match into |out|.
|out|->score is updated only if a better match is found. */
static BROTLI_INLINE void FN(FindLongestMatch)(
HashLongestMatch* BROTLI_RESTRICT self,
const BrotliEncoderDictionary* dictionary,
const uint8_t* BROTLI_RESTRICT data, const size_t ring_buffer_mask,
const int* BROTLI_RESTRICT distance_cache, const size_t cur_ix,
const size_t max_length, const size_t max_backward,
const size_t dictionary_distance, const size_t max_distance,
HasherSearchResult* BROTLI_RESTRICT out) {
uint16_t* BROTLI_RESTRICT num = self->num_;
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
/* Don't accept a short copy from far away. */
score_t min_score = out->score;
score_t best_score = out->score;
size_t best_len = out->len;
size_t i;
/* Precalculate the hash key and prefetch the bucket. */
const uint32_t key =
FN(HashBytes)(&data[cur_ix_masked], self->hash_shift_);
uint32_t* BROTLI_RESTRICT bucket = &buckets[key << self->block_bits_];
PREFETCH_L1(bucket);
if (self->block_bits_ > 4) PREFETCH_L1(bucket + 16);
out->len = 0;
out->len_code_delta = 0;
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
/* Try last distance first. */
for (i = 0; i < (size_t)self->num_last_distances_to_check_; ++i) {
const size_t backward = (size_t)distance_cache[i];
size_t prev_ix = (size_t)(cur_ix - backward);
if (prev_ix >= cur_ix) {
continue;
}
if (BROTLI_PREDICT_FALSE(backward > max_backward)) {
continue;
}
prev_ix &= ring_buffer_mask;
if (cur_ix_masked + best_len > ring_buffer_mask) {
break;
}
if (prev_ix + best_len > ring_buffer_mask ||
data[cur_ix_masked + best_len] != data[prev_ix + best_len]) {
continue;
}
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 3 || (len == 2 && i < 2)) {
/* Comparing for >= 2 does not change the semantics, but just saves for
a few unnecessary binary logarithms in backward reference score,
since we are not interested in such short matches. */
score_t score = BackwardReferenceScoreUsingLastDistance(len);
if (best_score < score) {
if (i != 0) score -= BackwardReferencePenaltyUsingLastDistance(i);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
}
}
/* we require matches of len >4, so increase best_len to 3, so we can compare
* 4 bytes all the time. */
if (best_len < 3) {
best_len = 3;
}
{
const size_t down =
(num[key] > self->block_size_) ? (num[key] - self->block_size_) : 0u;
for (i = num[key]; i > down;) {
size_t prev_ix = bucket[--i & self->block_mask_];
const size_t backward = cur_ix - prev_ix;
if (BROTLI_PREDICT_FALSE(backward > max_backward)) {
break;
}
prev_ix &= ring_buffer_mask;
if (cur_ix_masked + best_len > ring_buffer_mask) {
break;
}
if (prev_ix + best_len > ring_buffer_mask ||
/* compare 4 bytes ending at best_len + 1 */
BrotliUnalignedRead32(&data[cur_ix_masked + best_len - 3]) !=
BrotliUnalignedRead32(&data[prev_ix + best_len - 3])) {
continue;
}
{
const size_t len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 4) {
/* Comparing for >= 3 does not change the semantics, but just saves
for a few unnecessary binary logarithms in backward reference
score, since we are not interested in such short matches. */
score_t score = BackwardReferenceScore(len, backward);
if (best_score < score) {
best_score = score;
best_len = len;
out->len = best_len;
out->distance = backward;
out->score = best_score;
}
}
}
}
bucket[num[key] & self->block_mask_] = (uint32_t)cur_ix;
++num[key];
}
if (min_score == out->score) {
SearchInStaticDictionary(dictionary,
self->common_, &data[cur_ix_masked], max_length, dictionary_distance,
max_distance, out, BROTLI_FALSE);
}
}
#undef HashLongestMatch

View File

@@ -0,0 +1,270 @@
/* NOLINT(build/header_guard) */
/* Copyright 2010 Google Inc. All Rights Reserved.
Distributed under MIT license.
See file LICENSE for detail or copy at https://opensource.org/licenses/MIT
*/
/* template parameters: FN, BUCKET_BITS, BUCKET_SWEEP_BITS, HASH_LEN,
USE_DICTIONARY
*/
#define HashLongestMatchQuickly HASHER()
#define BUCKET_SIZE (1 << BUCKET_BITS)
#define BUCKET_MASK (BUCKET_SIZE - 1)
#define BUCKET_SWEEP (1 << BUCKET_SWEEP_BITS)
#define BUCKET_SWEEP_MASK ((BUCKET_SWEEP - 1) << 3)
static BROTLI_INLINE size_t FN(HashTypeLength)(void) { return 8; }
static BROTLI_INLINE size_t FN(StoreLookahead)(void) { return 8; }
/* HashBytes is the function that chooses the bucket to place
the address in. The HashLongestMatch and HashLongestMatchQuickly
classes have separate, different implementations of hashing. */
static uint32_t FN(HashBytes)(const uint8_t* data) {
const uint64_t h = ((BROTLI_UNALIGNED_LOAD64LE(data) << (64 - 8 * HASH_LEN)) *
kHashMul64);
/* The higher bits contain more mixture from the multiplication,
so we take our results from there. */
return (uint32_t)(h >> (64 - BUCKET_BITS));
}
/* A (forgetful) hash table to the data seen by the compressor, to
help create backward references to previous data.
This is a hash map of fixed size (BUCKET_SIZE). */
typedef struct HashLongestMatchQuickly {
/* Shortcuts. */
HasherCommon* common;
/* --- Dynamic size members --- */
uint32_t* buckets_; /* uint32_t[BUCKET_SIZE]; */
} HashLongestMatchQuickly;
static void FN(Initialize)(
HasherCommon* common, HashLongestMatchQuickly* BROTLI_RESTRICT self,
const BrotliEncoderParams* params) {
self->common = common;
BROTLI_UNUSED(params);
self->buckets_ = (uint32_t*)common->extra[0];
}
static void FN(Prepare)(
HashLongestMatchQuickly* BROTLI_RESTRICT self, BROTLI_BOOL one_shot,
size_t input_size, const uint8_t* BROTLI_RESTRICT data) {
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
/* Partial preparation is 100 times slower (per socket). */
size_t partial_prepare_threshold = BUCKET_SIZE >> 5;
if (one_shot && input_size <= partial_prepare_threshold) {
size_t i;
for (i = 0; i < input_size; ++i) {
const uint32_t key = FN(HashBytes)(&data[i]);
if (BUCKET_SWEEP == 1) {
buckets[key] = 0;
} else {
uint32_t j;
for (j = 0; j < BUCKET_SWEEP; ++j) {
buckets[(key + (j << 3)) & BUCKET_MASK] = 0;
}
}
}
} else {
/* It is not strictly necessary to fill this buffer here, but
not filling will make the results of the compression stochastic
(but correct). This is because random data would cause the
system to find accidentally good backward references here and there. */
memset(buckets, 0, sizeof(uint32_t) * BUCKET_SIZE);
}
}
static BROTLI_INLINE void FN(HashMemAllocInBytes)(
const BrotliEncoderParams* params, BROTLI_BOOL one_shot,
size_t input_size, size_t* alloc_size) {
BROTLI_UNUSED(params);
BROTLI_UNUSED(one_shot);
BROTLI_UNUSED(input_size);
alloc_size[0] = sizeof(uint32_t) * BUCKET_SIZE;
}
/* Look at 5 bytes at &data[ix & mask].
Compute a hash from these, and store the value somewhere within
[ix .. ix+3]. */
static BROTLI_INLINE void FN(Store)(
HashLongestMatchQuickly* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask, const size_t ix) {
const uint32_t key = FN(HashBytes)(&data[ix & mask]);
if (BUCKET_SWEEP == 1) {
self->buckets_[key] = (uint32_t)ix;
} else {
/* Wiggle the value with the bucket sweep range. */
const uint32_t off = ix & BUCKET_SWEEP_MASK;
self->buckets_[(key + off) & BUCKET_MASK] = (uint32_t)ix;
}
}
static BROTLI_INLINE void FN(StoreRange)(
HashLongestMatchQuickly* BROTLI_RESTRICT self,
const uint8_t* BROTLI_RESTRICT data, const size_t mask,
const size_t ix_start, const size_t ix_end) {
size_t i;
for (i = ix_start; i < ix_end; ++i) {
FN(Store)(self, data, mask, i);
}
}
static BROTLI_INLINE void FN(StitchToPreviousBlock)(
HashLongestMatchQuickly* BROTLI_RESTRICT self,
size_t num_bytes, size_t position,
const uint8_t* ringbuffer, size_t ringbuffer_mask) {
if (num_bytes >= FN(HashTypeLength)() - 1 && position >= 3) {
/* Prepare the hashes for three last bytes of the last write.
These could not be calculated before, since they require knowledge
of both the previous and the current block. */
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 3);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 2);
FN(Store)(self, ringbuffer, ringbuffer_mask, position - 1);
}
}
static BROTLI_INLINE void FN(PrepareDistanceCache)(
HashLongestMatchQuickly* BROTLI_RESTRICT self,
int* BROTLI_RESTRICT distance_cache) {
BROTLI_UNUSED(self);
BROTLI_UNUSED(distance_cache);
}
/* Find a longest backward match of &data[cur_ix & ring_buffer_mask]
up to the length of max_length and stores the position cur_ix in the
hash table.
Does not look for matches longer than max_length.
Does not look for matches further away than max_backward.
Writes the best match into |out|.
|out|->score is updated only if a better match is found. */
static BROTLI_INLINE void FN(FindLongestMatch)(
HashLongestMatchQuickly* BROTLI_RESTRICT self,
const BrotliEncoderDictionary* dictionary,
const uint8_t* BROTLI_RESTRICT data,
const size_t ring_buffer_mask, const int* BROTLI_RESTRICT distance_cache,
const size_t cur_ix, const size_t max_length, const size_t max_backward,
const size_t dictionary_distance, const size_t max_distance,
HasherSearchResult* BROTLI_RESTRICT out) {
uint32_t* BROTLI_RESTRICT buckets = self->buckets_;
const size_t best_len_in = out->len;
const size_t cur_ix_masked = cur_ix & ring_buffer_mask;
/* TODO: compare 4 bytes at once (and set the minimum best len to 4) */
int compare_char = data[cur_ix_masked + best_len_in];
size_t key = FN(HashBytes)(&data[cur_ix_masked]);
size_t key_out;
score_t min_score = out->score;
score_t best_score = out->score;
size_t best_len = best_len_in;
size_t cached_backward = (size_t)distance_cache[0];
size_t prev_ix = cur_ix - cached_backward;
BROTLI_DCHECK(cur_ix_masked + max_length <= ring_buffer_mask);
out->len_code_delta = 0;
if (prev_ix < cur_ix) {
prev_ix &= (uint32_t)ring_buffer_mask;
if (compare_char == data[prev_ix + best_len]) {
const size_t len = FindMatchLengthWithLimit(
&data[prev_ix], &data[cur_ix_masked], max_length);
if (len >= 4) {
const score_t score = BackwardReferenceScoreUsingLastDistance(len);
if (best_score < score) {
out->len = len;
out->distance = cached_backward;
out->score = score;
if (BUCKET_SWEEP == 1) {
buckets[key] = (uint32_t)cur_ix;
return;
} else {
best_len = len;
best_score = score;
compare_char = data[cur_ix_masked + len];
}
}
}
}
}
if (BUCKET_SWEEP == 1) {
size_t backward;
size_t len;
/* Only one to look for, don't bother to prepare for a loop. */
prev_ix = buckets[key];
buckets[key] = (uint32_t)cur_ix;
backward = cur_ix - prev_ix;
prev_ix &= (uint32_t)ring_buffer_mask;
if (compare_char != data[prev_ix + best_len_in]) {
return;
}
if (BROTLI_PREDICT_FALSE(backward == 0 || backward > max_backward)) {
return;
}
len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 4) {
const score_t score = BackwardReferenceScore(len, backward);
if (best_score < score) {
out->len = len;
out->distance = backward;
out->score = score;
return;
}
}
} else {
size_t keys[BUCKET_SWEEP];
size_t i;
for (i = 0; i < BUCKET_SWEEP; ++i) {
keys[i] = (key + (i << 3)) & BUCKET_MASK;
}
key_out = keys[(cur_ix & BUCKET_SWEEP_MASK) >> 3];
for (i = 0; i < BUCKET_SWEEP; ++i) {
size_t len;
size_t backward;
prev_ix = buckets[keys[i]];
backward = cur_ix - prev_ix;
prev_ix &= (uint32_t)ring_buffer_mask;
if (compare_char != data[prev_ix + best_len]) {
continue;
}
if (BROTLI_PREDICT_FALSE(backward == 0 || backward > max_backward)) {
continue;
}
len = FindMatchLengthWithLimit(&data[prev_ix],
&data[cur_ix_masked],
max_length);
if (len >= 4) {
const score_t score = BackwardReferenceScore(len, backward);
if (best_score < score) {
best_len = len;
out->len = len;
compare_char = data[cur_ix_masked + len];
best_score = score;
out->score = score;
out->distance = backward;
}
}
}
}
if (USE_DICTIONARY && min_score == out->score) {
SearchInStaticDictionary(dictionary,
self->common, &data[cur_ix_masked], max_length, dictionary_distance,
max_distance, out, BROTLI_TRUE);
}
if (BUCKET_SWEEP != 1) {
buckets[key_out] = (uint32_t)cur_ix;
}
}
#undef BUCKET_SWEEP_MASK
#undef BUCKET_SWEEP
#undef BUCKET_MASK
#undef BUCKET_SIZE
#undef HashLongestMatchQuickly

Some files were not shown because too many files have changed in this diff Show More