Make til::color's COLORREF conversion more optimal (#7619)

Clang (10) has no trouble optimizing the COLORREF conversion operator to
a simple 32-bit load with mask (!) even though it's a series of bit
shifts across multiple struct members.

MSVC (19.24) doesn't make the same optimization decision, and it emits
three 8-bit loads and some shifting.

In any case, the optimization only applies at -O2 (clang) and above.

In this commit, we leverage the spec-legality of using unions for type
conversions and the overlap of four uint8_ts and a uint32_t to make the
conversion very obvious to both compilers.

x86_64 msvc | O0 | O1 | O2
------------|----|----|--------------------
shifts      | 12 | 11 | 11 (fully inlined)
union       |  5 |  1 |  1 (fully inlined)

x86_64 clang | O0 | O1 | O2 + O3
-------------|----|----|--------------------
shifts       | 14 |  5 |  1 (fully inlined)
union        |  9 |  3 |  1 (fully inlined)

j4james brought up some concerns about til::color's minor wastefulness
in https://github.com/microsoft/terminal/pull/7578#discussion_r487355989.

This is a clear, simple transformation that saves us a few instructions
in a relatively common case, so I'm accepting a micro-optimization even
though we don't have data showing this to be a hot spot.
This commit is contained in:
Dustin L. Howett
2020-09-14 11:51:03 -07:00
committed by GitHub
parent 88d1527985
commit c17f448d73
2 changed files with 27 additions and 7 deletions

View File

@@ -6,6 +6,7 @@ ABANDONFONT
ABCDEFGHIJKLMNO
ABCG
abf
abgr
abi
ACCESSTOKEN
acec