[MinGW-Notify] [mingw] #39174: Invalid code generated for extern thread_local variables

MinGW Notification List mingw****@lists*****
Mon Apr 29 22:25:44 JST 2019

#39174: Invalid code generated for extern thread_local variables

  Open Date: 2019-04-29 11:23
Last Update: 2019-04-29 14:25

Last Changes/Comment on this Ticket:
2019-04-29 14:25 Updated by: keith
 * Status Update from Open to Closed


I'm closing this, as invalid, for the following reasons:—

  • The code you illustrate is for a 64-bit host, so it most definitely was not
    generated by any compiler originating from this project.
  • GCC often uses lea instructions, as multi-byte nop filler code; that may,
    or may not be the intent here, but it's something to bear in mind.

In any case, since you are using tools originating from some other project,
(which, BTW, is abusing our registered trademark, without authorization), we
cannot help you.

Ticket Status:

      Reporter: vagran
         Owner: (None)
          Type: Issues
        Status: Closed
      Priority: 5 - Medium
     MileStone: (None)
     Component: (None)
      Severity: 5 - Medium
    Resolution: None

Ticket details:

The compiler produces strange code for C++11 global thread-local variables.
The resulting binary also crashes consistently when built under some
circumstances. The problem appears at every optimization level above -O0; a
binary compiled with -O0 looks correct.
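
The attachment is not reproduced in this message. Judging only from the
disassembly and the link line quoted further down (main stores 42 into a
thread-local named tl, and the objects are main.cpp.o and tls.cpp.o), the
sample presumably resembles the sketch below. The header name, the initial
value and the helper store_answer() are assumptions made for illustration; the
three files are collapsed into one so the sketch compiles on its own.

```cpp
// Hypothetical reconstruction of the sample, collapsed into a single file.
// In the ticket these pieces would live in separate translation units.

// --- tls.h (assumed name) ---
extern thread_local int tl;   // declaration visible to main.cpp

// --- tls.cpp ---
thread_local int tl = 0;      // definition; the initial value is an assumption

// --- main.cpp ---
// store_answer() is a hypothetical stand-in for the body of main().
int store_answer() {
    tl = 42;      // corresponds to "movl $0x2a,(%rax)" in the listings
    return tl;
}
```

With the declaration and the definition in different translation units, the
compiler cannot see at the point of use whether tl needs dynamic
initialization, which is presumably why it goes through a TLS wrapper.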

I use the MSYS2 build environment with gcc 8.3.0 installed. Extract the
attached minimal example and run the following commands:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..

Then inspect the generated code with GDB:

gdb sample.exe

(gdb) disas main
Dump of assembler code for function main():
   0x0000000000402d20 <+0>:     sub    $0x28,%rsp
   0x0000000000402d24 <+4>:     callq  0x401610 <__main>
   0x0000000000402d29 <+9>:     lea    -0x402d30(%rip),%rax        # 0x0
   0x0000000000402d30 <+16>:    test   %rax,%rax
   0x0000000000402d33 <+19>:    je     0x402d3a <main()+26>
   0x0000000000402d35 <+21>:    callq  0x0
   0x0000000000402d3a <+26>:    mov    0x160f(%rip),%rcx        # 0x404350 <.refptr.__emutls_v.tl>
   0x0000000000402d41 <+33>:    callq  0x402af0 <__emutls_get_address>
   0x0000000000402d46 <+38>:    movl   $0x2a,(%rax)
   0x0000000000402d4c <+44>:    xor    %eax,%eax
   0x0000000000402d4e <+46>:    add    $0x28,%rsp
   0x0000000000402d52 <+50>:    retq

The strange thing is the instruction at <+9>, "lea -0x402d30(%rip),%rax", which
calculates an address; the result is tested against zero at <+16>, and a call
to that address is executed at <+21> if it is non-zero. In this particular
example the result is zero, so it does not crash. In my case, however, the
result was non-zero yet still invalid, which causes a crash. Unfortunately my
case is not easy to reproduce; it was probably triggered after I statically
linked my DLL with a large library (ffmpeg) and the calculated address became
non-zero. Besides that, even in the attached example the purpose of this
address manipulation is unclear, and it looks like a code optimization bug.

   0x000000007771425d <+13>:    lea    -0x6f254264(%rip),%rax        # 0x84c0000
   0x0000000077714264 <+20>:    mov    %rcx,%rbx
   0x0000000077714267 <+23>:    mov    %rdx,%rdi
   0x000000007771426a <+26>:    mov    %r8,%rbp
   0x000000007771426d <+29>:    mov    %r9,%r12
   0x0000000077714270 <+32>:    test   %rax,%rax
   0x0000000077714273 <+35>:    je     0x7771427a <Func()+42>
   0x0000000077714275 <+37>:    callq  0x84c0000 // <<<<<<<< Crash here!
   0x000000007771427a <+42>:    mov    0x813e3f(%rip),%rcx        # 0x77f280c0 <.refptr.__emutls_v._ZN4java6jniEnvE>
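
The two "#" annotations that GDB prints next to the lea instructions can be
checked by hand: a RIP-relative operand resolves to the address of the next
instruction plus the signed displacement. The sketch below redoes that
arithmetic for both listings; rip_relative() is a hypothetical helper, and the
instruction length of 7 bytes is read off the listings (0x402d30 - 0x402d29
and 0x77714264 - 0x7771425d).

```cpp
#include <cstdint>

// Effective address of a RIP-relative operand:
// address of the *next* instruction plus the signed displacement.
constexpr std::uint64_t rip_relative(std::uint64_t insn_addr,
                                     std::uint64_t insn_len,
                                     std::int64_t disp) {
    return insn_addr + insn_len + static_cast<std::uint64_t>(disp);
}

// Sample executable: lea -0x402d30(%rip),%rax at 0x402d29 yields 0.
static_assert(rip_relative(0x402d29, 7, -0x402d30) == 0x0,
              "sample build resolves to a null address");

// Crashing DLL: lea -0x6f254264(%rip),%rax at 0x7771425d yields 0x84c0000.
static_assert(rip_relative(0x7771425d, 7, -0x6f254264) == 0x84c0000,
              "crashing build resolves to a non-null address");
```

In both cases the displacement would yield zero only if the instruction
executed at one specific address. One possible reading, which the ticket does
not confirm, is that the displacement was fixed at link time for a load address
other than the one the DLL actually received.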

Compare this with the code compiled with -O0:

cmake -DCMAKE_BUILD_TYPE=Debug ..

(gdb) disas main
Dump of assembler code for function main():
   0x0000000000401560 <+0>:     push   %rbp
   0x0000000000401561 <+1>:     mov    %rsp,%rbp
   0x0000000000401564 <+4>:     sub    $0x20,%rsp
   0x0000000000401568 <+8>:     callq  0x401640 <__main>
   0x000000000040156d <+13>:    callq  0x402d50 <_ZTW2tl> // "TLS wrapper function for tl"
   0x0000000000401572 <+18>:    movl   $0x2a,(%rax)
   0x0000000000401578 <+24>:    mov    $0x0,%eax
   0x000000000040157d <+29>:    add    $0x20,%rsp
   0x0000000000401581 <+33>:    pop    %rbp
   0x0000000000401582 <+34>:    retq

(gdb) disas _ZTW2tl
Dump of assembler code for function _ZTW2tl:
   0x0000000000402d50 <+0>:     push   %rbp
   0x0000000000402d51 <+1>:     mov    %rsp,%rbp
   0x0000000000402d54 <+4>:     sub    $0x20,%rsp
   0x0000000000402d58 <+8>:     mov    0x15a1(%rip),%rax        # 0x404300 <.refptr._ZTH2tl> //"TLS init function for tl"
   0x0000000000402d5f <+15>:    test   %rax,%rax
   0x0000000000402d62 <+18>:    je     0x402d69 <_ZTW2tl+25>
   0x0000000000402d64 <+20>:    callq  0x0
   0x0000000000402d69 <+25>:    mov    0x15e0(%rip),%rcx        # 0x404350 <.refptr.__emutls_v.tl>
   0x0000000000402d70 <+32>:    callq  0x402b20 <__emutls_get_address>
   0x0000000000402d75 <+37>:    add    $0x20,%rsp
   0x0000000000402d79 <+41>:    pop    %rbp
   0x0000000000402d7a <+42>:    retq

Here the TLS wrapper is not inlined, and instead of calculating an address, an
actual variable read is generated at <+8>, which looks more correct. The call
target at <+20> is probably patched with the correct address when the image is
loaded into a process. In my case it looks like this:

(gdb) disas _ZTWN4java6jniEnvE
Dump of assembler code for function _ZTWN4java6jniEnvE:
   0x000000005c4aa790 <+0>:     push   %rbp
   0x000000005c4aa791 <+1>:     mov    %rsp,%rbp
   0x000000005c4aa794 <+4>:     sub    $0x20,%rsp
   0x000000005c4aa798 <+8>:     mov    0xa1cf1(%rip),%rax        # 0x5c54c490 <.refptr._ZTHN4java6jniEnvE>
   0x000000005c4aa79f <+15>:    test   %rax,%rax
   0x000000005c4aa7a2 <+18>:    je     0x5c4aa7a9 <_ZTWN4java6jniEnvE+25>
   0x000000005c4aa7a4 <+20>:    callq  0xffffffffec9a0000
   0x000000005c4aa7a9 <+25>:    mov    0xa1ff0(%rip),%rcx        # 0x5c54c7a0 <.refptr.__emutls_v._ZN4java6jniEnvE>
   0x000000005c4aa7b0 <+32>:    callq  0x5c1a0268 <__emutls_get_address>
   0x000000005c4aa7b5 <+37>:    add    $0x20,%rsp
   0x000000005c4aa7b9 <+41>:    pop    %rbp
   0x000000005c4aa7ba <+42>:    retq

This version does not crash and works as expected, whereas the code crashes if
compiled with -O1, -O2 or -O3. It would be nice to know the idea behind
calculating an address instead of reading a variable, as the debug build does;
the difference significantly changes application logic between release and
debug builds, which is very suspicious. For now, it looks like an optimization
bug.
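
For reference, the control flow visible in the -O0 listing of _ZTW2tl can be
written out in C++ roughly as follows. This is only a model of what the
disassembly shows, not GCC's actual implementation: every name ending in
_model is hypothetical, the function pointer stands in for the value loaded
from .refptr._ZTH2tl, and the plain thread_local variable stands in for the
__emutls_get_address call.

```cpp
// Hypothetical model of the TLS wrapper seen in the -O0 listing.
thread_local int tl_model = 0;       // stands in for the ticket's `tl`

using init_fn = void (*)();
init_fn tl_init_model = nullptr;     // stands in for the .refptr._ZTH2tl slot

int& tl_wrapper_model() {
    // mov .refptr._ZTH2tl,%rax ; test %rax,%rax ; je
    if (tl_init_model != nullptr) {
        tl_init_model();             // callq <TLS init function>
    }
    // mov .refptr.__emutls_v.tl,%rcx ; callq __emutls_get_address
    return tl_model;
}
```

In this model the call is skipped whenever the pointer is null. The optimized
listings perform the same test, but on an address computed with lea rather
than on a value loaded from memory.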

gcc -v
Using built-in specs.
Target: x86_64-w64-mingw32
Configured with: ../gcc-8.3.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/x86_64-w64-mingw32/include --libexecdir=/mingw64/lib --enable-bootstrap --with-arch=x86-64 --with-tune=generic --enable-languages=ada,c,lto,c++,objc,obj-c++,fortran --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts=yes --enable-libstdcxx-time=yes --disable-libstdcxx-pch --disable-libstdcxx-debug --disable-isl-version-check --enable-lto --enable-libgomp --disable-multilib --enable-checking=release --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev2, Built by MSYS2 project' --with-bugurl=https://sourceforge.net/projects/msys2 --with-gnu-as --with-gnu-ld
Thread model: posix
gcc version 8.3.0 (Rev2, Built by MSYS2 project)

uname -a
MSYS_NT-6.1 artyom-VM 2.11.2(0.329/5/3) 2018-11-26 09:22 x86_64 Msys

I also tried to compile the sample with clang; it failed to link:

[100%] Linking CXX executable sample.exe
/usr/bin/cmake.exe -E cmake_link_script CMakeFiles/sample.dir/link.txt --verbose=1
/mingw64/bin/clang++.exe  -g -O3 -DNDEBUG  -Wl,--enable-auto-import CMakeFiles/sample.dir/main.cpp.o CMakeFiles/sample.dir/tls.cpp.o  -o sample.exe -Wl,--out-implib,libsample.dll.a -Wl,--major-image-version,0,--minor-image-version,0
CMakeFiles/sample.dir/tls.cpp.o:(.text+0x0): multiple definition of `TLS wrapper function for tl'
CMakeFiles/sample.dir/main.cpp.o:(.text+0x50): first defined here
clang++.exe: error: linker command failed with exit code 1 (use -v to see invocation)

$ clang++ -v
clang version 7.0.1 (tags/RELEASE_701/final)
Target: x86_64-w64-windows-gnu
Thread model: posix
InstalledDir: C:\msys\mingw64\bin

Ticket information of MinGW - Minimalist GNU for Windows project
MinGW - Minimalist GNU for Windows Project is hosted on OSDN

Project URL: https://osdn.net/projects/mingw/
OSDN: https://osdn.net
