mirror of
https://github.com/bitwiseworks/gcc-os2.git
synced 2026-02-13 21:54:40 +00:00
Support C __aligned__ and C++ alignas attributes #13
Originally created by @dmik on GitHub (Jan 16, 2020).
There is a GCC `__aligned__` attribute and its C++11 `alignas` counterpart. In short, they allow specifying the alignment (in bytes) of a variable or struct. This is basically ignored in our port of GCC now. I.e. given the following `aligned.c`:
the command `gcc -S aligned.c` will generate the following assembly:
If we feed it to GCC under Linux, we will get this:
You may notice that the `.comm` directive, along with the second argument which is the variable size, also gets a third argument which specifies its desired alignment from the `aligned` attribute.
When such an assembly is then linked on OS/2, variables remain unaligned (or, to be exact, aligned at some default alignment which doesn't match the request). As a result, some applications that require strict alignment (e.g. because they use SSE2 instructions that require 128-bit, i.e. 16-byte, alignment) break.
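One way to observe the problem at run time is to check the variable addresses directly. The following is a minimal sketch, not the `aligned.c` from this issue: the variable names and the helpers `is_aligned` and `count_misaligned` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical test variables (not the aligned.c from this issue). */
int var16  __attribute__((aligned(16)));
int var64  __attribute__((aligned(64)));
int var128 __attribute__((aligned(128)));

/* Returns 1 if ptr is a multiple of align, which must be a power of two. */
int is_aligned(const void *ptr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Counts how many of the variables above missed their requested alignment. */
int count_misaligned(void)
{
    int bad = 0;
    bad += !is_aligned(&var16, 16);
    bad += !is_aligned(&var64, 64);
    bad += !is_aligned(&var128, 128);
    return bad;
}
```

On a toolchain that honors the attribute, `count_misaligned()` should return 0. Going by the description above, a non-zero result would be expected on OS/2 whenever the linker happens to place one of the variables off its requested boundary.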
The reason for that is that the object format used on OS/2 is `a.out`, and `gas` supports alignment only for the `elf` and `PE` formats, simply because `a.out` is very old and doesn't support per-symbol alignment specification. So generation of the third `.comm` argument is disabled for it.
Besides, there is also a similar problem in the OMF object file format which all `a.out` files need to be converted to (by `emxomf.exe`) before they can be linked into an OS/2 (LX) executable. OMF only supports per-segment alignment which may be word (2 bytes), dword (4 bytes), para (16 bytes) and 64 KBytes. Such per-segment alignment cannot satisfy all possible alignment requests (especially those greater than 16 bytes) in an efficient way. It could if 64K alignment were used for all segments, but this would be a big waste of program address space, especially with a large number of translation units (object files). So it's not practical.
The only solution here is to use something other than `a.out` for object files, for example the `elf` format, and then link it with some custom linker into an OS/2 executable. There are rumors that `wlink` from OpenWatcom (which we already use as the main, and the only supported, linker for GCC on OS/2) can link `elf` object files into LX executables but this needs checking.
@dmik commented on GitHub (Jan 16, 2020):
Note that for C++ the assembly is slightly different (it involves the usage of `.space` and `.balign` assembler directives) but it ends up similarly: neither `a.out` nor `omf` supports alignment on a level that allows preserving it in the final executable.
@dmik commented on GitHub (Jan 18, 2020):
Just to stress it, one of the key motivations to have this fixed is SSE support. Currently, we have to disable SSE (or lower the compiler optimization level) in some cases as the compiler generates SSE instructions requiring 128-bit alignment of memory variables. This is one case: https://github.com/bitwiseworks/libc/issues/30#issuecomment-465235535. And disabling SSE/MMX obviously degrades performance.
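For illustration, this is the kind of code that depends on the alignment being honored. It is a minimal sketch, not taken from libc or the linked issue: `samples` and `sum4` are hypothetical names, and the SSE path is compiled only when the compiler defines `__SSE__`.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical global; it lives in the data segment, not on the stack. */
float samples[4] __attribute__((aligned(16))) = { 1.0f, 2.0f, 3.0f, 4.0f };

/* Sums four floats starting at p. */
float sum4(const float *p)
{
#if defined(__SSE__)
    /* _mm_load_ps is an aligned load: p must be 16-byte aligned,
       otherwise the instruction faults. The assert makes that
       precondition visible instead of crashing in the load. */
    assert(((uintptr_t)p & 15) == 0);
    __m128 v = _mm_load_ps(p);
    float out[4];
    _mm_storeu_ps(out, v);   /* unaligned store, safe for any address */
    return out[0] + out[1] + out[2] + out[3];
#else
    /* Scalar fallback when SSE is not enabled at compile time. */
    return p[0] + p[1] + p[2] + p[3];
#endif
}
```

If the linker does not place `samples` on a 16-byte boundary, the assertion (or the aligned load itself in a release build) fails, which is the class of breakage this ticket is about.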
Note that GCC 9 now has `-mstackrealign` by default on OS/2 (4c1b7b3a1a) but it only solves alignment problems for stack-based variables. However, there are cases when the variables that SSE/MMX instructions operate on are located in the data segment (i.e. global/static). And such variables require this ticket to be fixed to get it working.
@komh commented on GitHub (Jan 22, 2020):
Per-segment alignment is one of 1(byte), 2(word), 4(dword), 16(para) and 256(page, maybe 4k, but it was not possible). 64k is not in them.
And .balign of gas worked up to 16 bytes alignment.
@dmik commented on GitHub (Jan 22, 2020):
@komh thanks! Any reference to where you got it from? I was just repeating somebody else's words, but I wonder if there is any official OMF specs. If 256 bytes alignment is really possible in OMF then it might be a solution, at least a temporary one. But the ELF feature of WLINK should be evaluated too. If it's there (or may be brought with some little effort), then making GCC produce ELF objects on OS/2 is also not a big deal.
@dmik commented on GitHub (Jan 23, 2020):
I've been provided the OMF specs in PDF, attaching it here for reference: omf.pdf.
As the document states, the SEGDEF record supports 1, 2, 4, 16 and page alignment. The latter on x386+ is always 4K. Anyway, 4K is also way too much for alignments >16 bytes. It will cause too much memory waste. So not an option. ELF seems like the only one.
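To put a rough number on that waste, the sketch below sums the padding inserted when segments are laid out back to back, each starting on a given boundary. The helper names `round_up` and `total_padding` and the segment sizes used in the note are illustrative assumptions, not anything taken from the OMF spec or from a real linker.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds offset up to the next multiple of align (a power of two). */
size_t round_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Total padding added when segments of the given sizes are placed one
   after another, each starting on an `align` boundary. */
size_t total_padding(const size_t *sizes, size_t count, size_t align)
{
    size_t offset = 0;
    size_t padding = 0;
    for (size_t i = 0; i < count; i++) {
        size_t start = round_up(offset, align);
        padding += start - offset;
        offset = start + sizes[i];
    }
    return padding;
}
```

For three hypothetical 100-byte segments this gives 24 bytes of padding at para (16) alignment, 56 bytes at 128-byte alignment and 7992 bytes at 4K page alignment.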
@dryeo commented on GitHub (Jan 23, 2020):
The NASM docs (section 7.4) say this about ALIGN, though talking about
segments but these extensions may be supported for regular ALIGN,
@komh commented on GitHub (Jan 24, 2020):
@dmik I've seen it in 'Object Module Format Reference', 'ALP Programming Guide and Reference' of OS/2 Toolkit 4.5 and 'NASM manual'.
As you said, OMF spec says that segment is aligned on 4K byte boundary on 32bits platforms. I also thought so. But it is aligned differently according to linkers.
wl.exe: 256
link386.exe: 4096
ilink.exe: 4096
That is, WATCOM linker aligns on 256 byte boundary, but IBM linkers align on 4K byte boundary.
At first, I tested WATCOM linker only.
FYI, OS/2 ld aligns on 16 byte boundary all the time. It's possible to change the value because we have the sources. ^^
Nevertheless, if ELF is supported, it would be best.
@dmik commented on GitHub (Apr 29, 2020):
Just for the record, if I get it right, we also need this task to be done to support things like AVX in FFMPEG (to not crash on OS4 kernels). See the above Chromium comment.
@dmik commented on GitHub (Jan 5, 2021):
Also note http://trac.netlabs.org/ports/ticket/206. There is a link to an article about why the `-fno-common` GCC option may help here. Needs checking.
@dmik commented on GitHub (Jan 13, 2021):
BTW, there is a suggestion in http://trac.netlabs.org/ports/ticket/206 to use `-fno-common` to overcome alignment limitations wrt AVX and a claim it helps with FFmpeg. It might be the case there but it doesn't seem to help with the test case from the description of this issue. It appears that `-fno-common` simply disables grouping the global variables in the COMM segment. Note also that when `-fno-common` is used, the assembly is identical to C++ (which therefore doesn't seem to use COMM by default).
This is the assembly with `-fno-common` (to contrast with the `-fcommon` assembly above):
This is how the DATA group's object looks in C mode with no special options (i.e. `-fcommon` is assumed on our platform):
This is how it looks in C mode with `-fno-common` and in C++:
As one may see, `-fno-common` makes alignment work much better (there are proper gaps between variables according to their size alignment) but because data sections from different object files (i.e. `crt0.obj` + `aligned.obj`) are glued together without any space or alignment, we end up with wrong alignment. The first offset in `aligned.obj` is `190` instead of `...80` as expected by GCC (because the maximum requested alignment for this object file is 128 which is 80 in hex). So all alignment gets a 16 (10 in hex) byte shift. Having it subtracted, we will get:
which would be the exact requested (correct) alignment.
In case of `-fcommon`, alignment is totally ignored it seems. But as this GCC bug shows, this option is mainly there for backward compatibility with ancient systems and in GCC 10 they made `-fno-common` the default (which is better both in terms of functionality and performance). Once we update to it we will get it "for free".
Here are the GCC docs for `-fcommon`, for reference.
So what we really need to fix here is the linker it seems (to obey the maximum object file alignment and align it accordingly when gluing object files together). I guess that fixing EMXOMF and WL to do so is not a big issue per se but it might make the results incompatible with other linkers and OMF files generated by other compilers.
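The shift described in this comment is plain modular arithmetic, which the following sketch reproduces. The function names are hypothetical and this is not how EMXOMF or WL are implemented; only the values 0x190 and 0x80 come from the dump discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Misalignment of the final address when an object's data starts at
   `base` and a variable sits at `offset` inside that object.
   `align` must be a power of two. */
uint32_t misalignment(uint32_t base, uint32_t offset, uint32_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (base + offset) & (align - 1);
}

/* The start address a linker would have to choose so that every offset
   computed by the compiler keeps its alignment: `base` rounded up to the
   object's maximum requested alignment. */
uint32_t aligned_base(uint32_t base, uint32_t max_align)
{
    assert(max_align != 0 && (max_align & (max_align - 1)) == 0);
    return (base + max_align - 1) & ~(max_align - 1);
}
```

With the numbers from this comment, `misalignment(0x190, 0, 0x80)` is 0x10, the 16-byte shift, while `aligned_base(0x190, 0x80)` is 0x200, from which every 0x80-aligned offset stays aligned.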
@dmik commented on GitHub (Jan 13, 2021):
BTW, OMF format seems to fully support `-fcommon` semantics via `COMDEF` records. This is what I get for it in the object file (used `listomf`):
And this is what `-fno-common` produces:
As one may see, PUBDEF is much more accurate because it supports the offset field that can express both size and alignment.
However, EMXOMF sets the BSS segment alignment to 16 bytes (PARA) and this does not fit the 64 and 128 byte alignment requirements when individual BSS segments are glued together. Using the page alignment here would satisfy it but, as mentioned above, this would waste too many bytes when aligning. Perhaps we could effectively regroup PUBDEF records to minimize this waste (using different SEGDEF with different names and alignments) but this needs some thinking and it's a question whether WL will support that well.
@dmik commented on GitHub (Jan 13, 2021):
Also, it's clear now why `-fno-common` may help with FFMPEG/AVX. Given that it makes C/C++ alignment attributes respected and given that EMXOMF always uses 16-byte alignment for public variables in this mode, it is guaranteed that the requested 128-bit (i.e. 16-byte) alignment will be provided. So, if AVX in FFMPEG doesn't use instructions requiring 256-bit (32-byte) alignment (which even `-fno-common` cannot guarantee because of EMXOMF limitations), this will indeed work.