[Repost] Thread Local Storage - The C++ Way

http://www.codeproject.com/Articles/8113/Thread-Local-Storage-The-C-Way

 

Introduction

Global data, while usually considered poor design, is nevertheless often a useful means of preserving state between related function calls. With threads, the issue is unfortunately complicated by the need for access synchronisation, so that no two threads modify the data at the same time.

There are times when you will want to have a globally visible object, while still having the data content accessible only to the calling thread, without holding off other threads that contend for the "same" global object. This is where thread local storage (TLS) comes in. TLS is something the operating system / threading subsystem provides, and by its very nature is rather low level.

From a globally visible object (in C++) you expect that its constructors are called before you enter "main", and that it is disposed of properly after you exit from "main". Consequently one would expect a thread local "global" object to be constructed when a thread starts up, and destroyed when the thread exits. But this is not the case! Using the native API one can only have TLS that needs neither code to construct nor code to destruct.

While at first glance this is somewhat disappointing, there are reasons not to instantiate all these objects automatically on every thread creation. A clean solution to this problem is presented e.g. in the "boost" library. The standard "pthread" C library also addresses this problem properly. But when you need to use the native Windows threading API, or need to write a library that makes use of TLS but has no control over the threading API the client code is using, you are apparently lost.
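For comparison, the pthread approach mentioned above can be sketched as follows. This is only an illustration and not part of the article's library: the names Counter, destroy_counter, worker and run_pthread_tls_demo are made up for this example. The point is that pthread_key_create() accepts a destructor function, which the threading library calls for each exiting thread whose slot is non-null.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical payload type, used for illustration only.
struct Counter {
    int value;
    Counter() : value(42) {}
};

static pthread_key_t g_key;   // one key, one slot per thread
static int g_destroyed = 0;   // written only by the exiting worker thread

// Called by the pthread library at thread exit for every non-null slot value.
extern "C" void destroy_counter(void* p) {
    delete static_cast<Counter*>(p);
    ++g_destroyed;
}

extern "C" void* worker(void*) {
    // "Construction" is explicit: the thread fills its own slot.
    pthread_setspecific(g_key, new Counter);
    Counter* c = static_cast<Counter*>(pthread_getspecific(g_key));
    c->value += 1;
    return 0;   // no delete here; destroy_counter runs during thread exit
}

// Returns the number of destructor calls seen after one worker has finished.
int run_pthread_tls_demo() {
    g_destroyed = 0;
    pthread_key_create(&g_key, &destroy_counter);
    pthread_t t;
    pthread_create(&t, 0, &worker, 0);
    pthread_join(t, 0);   // the worker, including its TLS cleanup, is done here
    pthread_key_delete(g_key);
    return g_destroyed;
}
```

Calling run_pthread_tls_demo() from main() should report one destructor call. The native Win32 TLS slots of that era offer no comparable destructor hook, which is the gap the rest of this article is about.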

Fortunately the situation is not hopeless, and this is the topic of this article. The Windows Portable Executable (PE) format provides support for TLS callbacks. Although the documentation is hard to read, it can be done with current compilers, i.e. MSVC 6.0, 7.1, ... Since seemingly no one else has used this feature before, and not even the C runtime library (CRT) makes use of it, you should be a little careful and watch out for undesired behaviour. That the CRT does not use the feature does not mean that it does not implement it. Unfortunately there is a small bug in the MSVC 6.0 implementation, which my code also works around.

If the concepts presented in this article prove to be workable in "real life", I would be glad if this article has helped to remove some dust from this topic and make it usable for a broader range of applications. I could e.g. think of a generalized atexit_thread function that makes use of the concepts presented here.
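To give an idea of what such an atexit_thread could look like, here is a minimal sketch. It is an assumption-laden illustration: it uses the C++11 thread_local keyword and std::thread instead of the PE TLS callbacks described in this article, and the names sketch::atexit_thread, thread_exit_list and run_atexit_thread_demo are hypothetical. Handlers registered by a thread run in reverse order of registration when that thread exits.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

namespace sketch {

// Per-thread list of handlers. Its destructor runs when the owning
// thread exits and calls the handlers in reverse order of registration.
class thread_exit_list {
public:
    ~thread_exit_list() {
        for (auto it = handlers_.rbegin(); it != handlers_.rend(); ++it)
            (*it)();
    }
    void add(std::function<void()> f) { handlers_.push_back(std::move(f)); }
private:
    std::vector<std::function<void()>> handlers_;
};

// Hypothetical function, named after the suggestion in the text.
inline void atexit_thread(std::function<void()> f) {
    thread_local thread_exit_list list;   // one list per calling thread
    list.add(std::move(f));
}

} // namespace sketch

// Registers two handlers on a worker thread and returns the order
// in which they were run at thread exit.
std::vector<int> run_atexit_thread_demo() {
    std::vector<int> order;
    std::thread t([&order] {
        sketch::atexit_thread([&order] { order.push_back(1); });
        sketch::atexit_thread([&order] { order.push_back(2); });
    });
    t.join();   // handlers have run by the time join() returns
    return order;
}
```

The sketch relies on the compiler's own thread-exit machinery, so it says nothing about how well the TLS callback route would serve the same purpose on MSVC 6.0 or 7.1.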

Before going to explain the gory details, I want to mention Aaron W. LaFramboise who made me aware of the existence of the TLS-Callback mechanism.

Using the code

If you are using the precompiled binaries, you simply need to copy the *.lib files to a directory where your compiler searches for libraries. Likewise, copy the files from the include directory to a directory where your compiler searches for includes. Alternatively you may simply copy the files to your project directory.

The following is a simple demonstration of usage, to get you started.

#include <process.h>
// first include the header file
#include <tls.h>
// Sleep() is declared in windows.h
#include <windows.h>

// this is your class
struct A {
    A() : n(42) {
    }
    ~A() {
    }
    int the_answer_is() {
        int m = n;
        n = 0;
        return m;
    }
    int n;
};        

// now define a tls wrapper of class A
tls_ptr<A> pA;

// this is the threaded procedure
void  run(void*)
{
    // instantiate a new "A"
    pA.reset(new A);

    // access the tls-object    
    int ans = pA->the_answer_is();

    // note, that we do not need to deallocate
    // the object. This is getting done automagically
    // when the thread exits.
}

int main(int argc, char* argv[])
{
    // the main thread also gets a local copy of the tls.
    pA.reset(new A);

    // start the thread
    _beginthread(&run, 0, 0);

    // call into the main threads version
    pA->the_answer_is();

    // the "run" thread should have ended when we
    // are exiting.
    Sleep(10000);
    
    // again we do not need to free our tls object.
    // this is comparable in behaviour to objects
    // at global scope.
    return 0;
}

At first glance it might appear more natural not to wrap the tls-objects as pointers, but in fact it is not. Although the objects are globally visible, they are still "delegates" that forward to a thread local copy. The natural way to express delegation in C++ is a pointer object. (The technical reason of course is that you cannot overload the "." operator, whereas "->" can be overloaded.)
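The delegation idea can be sketched in portable C++11 as follows. This is not the article's tls_ptr: it is a simplified stand-in (sketch::tls_ptr) that assumes the thread_local keyword is available, keeps one slot per (T, Tag) pair rather than per object, and leaves out release(). The struct Answer and the function run_tls_ptr_demo are hypothetical names for this example.

```cpp
#include <cassert>
#include <memory>
#include <thread>

namespace sketch {

// A globally visible "delegate" that forwards to a per-thread object.
// Simplification: one slot per (T, Tag) pair, not one per tls_ptr instance.
template <class T, int Tag = 0>
class tls_ptr {
public:
    void reset(T* p = nullptr) { slot().reset(p); }   // deletes any prior object
    T* get() const { return slot().get(); }
    T* operator->() const { return get(); }   // "." cannot be overloaded, "->" can
    T& operator*() const { return *get(); }
private:
    static std::unique_ptr<T>& slot() {
        thread_local std::unique_ptr<T> p;    // destroyed when the thread exits
        return p;
    }
};

} // namespace sketch

struct Answer {
    int n;
    Answer() : n(42) {}
    int take() { int m = n; n = 0; return m; }
};

sketch::tls_ptr<Answer> g_answer;   // one global name, one object per thread

// Returns true if the caller's copy and the worker's copy are independent.
bool run_tls_ptr_demo() {
    g_answer.reset(new Answer);
    int from_worker = -1;
    std::thread t([&from_worker] {
        g_answer.reset(new Answer);        // this thread's own copy
        from_worker = g_answer->take();    // zeroes only the worker's copy
    });
    t.join();
    int from_caller = g_answer->take();    // unaffected by the worker
    g_answer.reset();
    return from_worker == 42 && from_caller == 42;
}
```

Both threads go through the same global g_answer, yet each reads 42 from its own copy, which is the behaviour the pointer-style interface is meant to convey.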

You can of course use this mechanism when building a "*.exe" file, but you can also use it when building a "*.dll" image. However, when you plan to load your DLL with LoadLibrary() you should define the macro TLS_ALLOC when building your DLL. This is not necessary when using your DLL by means of an import library. A similar restriction applies when delay-loading your DLL. Please consult your compiler documentation if you are interested in the reasons for this. (Defining TLS_ALLOC forces the use of the TlsAlloc() family of functions from the Win32 API.)

The complete API is kept very simple:

tls_ptr<A> pA;         // declare an object of class A
pA.reset(new A);       // create a tls of class A when needed
pA.reset(new A(45));   // create a tls of class A with a custom constructor
                       // note, that this also deletes any prior objects
                       // that might have been allocated to pA
pA.release();          // same as pA.reset(0), releases the thread local
                       // object
A& refA = *pA;         // get a temporary reference to the contained object
                       // for faster access
pA->the_answer_is();   // access the object

Please note again that it is not necessary to explicitly call the destructors of your class (or release()). This is very handy when you are writing a piece of code that has no control over the calling threads but must still be multithread safe. One caveat, however: the destructors of your class are called _after_ the CRT code has ended the thread. Consequently, when you do something fancy in your destructors that causes the CRT to reallocate its internal thread local storage pointers, you will be left with a small memory leak in the CRT. This is comparable in effect to the case where you use the native Win32 API functions to create a thread instead of _beginthread().

In principle that is all you need. But wait! I mentioned a small bug in version 6 of the compiler. Luckily it is easy to work around. I provided an include file tlsfix.h which you will need to include in your program. You need to make sure it is included before windows.h. To be more precise: the TLS library must be searched before the default CRT library. So alternatively you may specify the library first on the command line and omit the inclusion of tlsfix.h.

Background

I will not discuss the user interface here. Suffice it to say that it is essentially the same as in the boost library. However, I omitted the ability to specify arbitrary deleter functions, since this would have required including the boost library in my code. I wanted to keep it small and just demonstrate the principles. My implementation also deviates from boost in that it uses native compiler support for TLS variables, thus gaining an almost fourfold speed improvement. Needless to say, my implementation is Windows specific.

When thinking about TLS for C++, the main question is how to run the constructors and destructors. A careful study of the PE format (e.g. in the MSDN library) reveals that it has almost always provided for TLS support. (Thanks again to Aaron W. LaFramboise who read it carefully enough.) Of special interest is the section about TLS callbacks:

The program can provide one or more TLS callback functions (though Microsoft 
compilers do not currently use this feature) to support additional 
initialization and termination for TLS data objects. A typical reason to use 
such a callback function would be to call constructors and destructors for 
objects.

Well, it is true that the compilers do not use the feature, but nothing prevents user code from using it. One must somehow convince the compiler (to be honest, it is the linker) to place your callback so that the operating system will call it. It turns out that this is surprisingly simple (omitting the details for a moment).

// declare your callback
void NTAPI on_tls_callback(PVOID h, DWORD dwReason, PVOID pv)
{
    if( DLL_THREAD_DETACH == dwReason )
        basic_tls::thread_term();
}

// put a pointer in a special segment
#pragma data_seg(".CRT$XLB")
PIMAGE_TLS_CALLBACK p_thread_callback = on_tls_callback;
#pragma data_seg()

You can even add more callbacks by appending pointers to the ".CRT$XLB" segment. The fancy definitions are available from the windows.h and, in turn, winnt.h include files.

Now about the details: you will find at times that your callbacks are not called. This happens when the linker does not wire up your segments correctly. It turns out that this coincides with not using any __declspec(thread) in your code. A further study of the PE format description reveals:

The Microsoft run-time library facilitates this process by defining a memory image of the TLS Directory and giving it the special name “__tls_used” (Intel x86 platforms) or “_tls_used” (other platforms). The linker looks for this memory image and uses the data there to create the TLS Directory. Other compilers that support TLS and work with the Microsoft linker must use this same technique.

Consequently, when the linker does not find the _tls_used symbol, it won't wire in your callbacks. Luckily this is easy to circumvent:

#pragma comment(linker, "/INCLUDE:__tls_used")

This will pull in the code from the CRT that manages TLS. When using a version 7 compiler, that is all you need. (Actually I tried this with 7.1.) It turns out, however, that this does not work with a version 6 compiler. The operating system cannot be the culprit, since code compiled by version 7 does work properly. After a little guesswork you will find that the CRT code from version 6 is slightly broken, because it inserts a wrong offset to the callback table. It is then easy to replace the erroneous code and convince the linker to wire in the workaround before the broken version from the CRT. You can study the tlsfix.c file from my submission if you are interested in the details.

Points of Interest

Which is the first function of your program that is called by the operating system? Of course it is not main(). That was easy. Then mainCRTStartup, specified as the entry point in the linker, comes to mind. Wrong again. Interestingly, the first function being called is the TLS callback with Reason == DLL_PROCESS_ATTACH. But wait. Don't rely on this. This is not true on WinXP. I observed this on Win2000 only.

I have not yet tried the code on Win95/98, WinXP Home Edition and Win2003. I would be interested in feedback about using this code on these platforms. In principle it should work, because it is a feature of PE and not of the operating system, but ...

History

08.28.2004 Uploaded documentation, source and sample code.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Roland Schwarz, Austria

Reposted on 2012-06-27 17:27 by Tim

