Zero Lee's Column

Template and Inheritance

As Meyers noted in Item 24 of More Effective C++, the inability to inline a virtual function is its biggest performance penalty.
Virtual functions seem to inflict a performance cost in several ways:
[1] The vptr must be initialized in the constructor.
[2] A virtual function is invoked via pointer indirection. We must fetch the pointer to the function table and then access the correct function offset.
[3] Inlining is a compile-time decision. The compiler cannot inline virtual functions whose resolution takes place at run-time.
The true cost of virtual functions then boils down to the third item only.
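
The three costs can be made concrete with a small sketch. The names PlainCounter, VirtualCounter, DoubleCounter, and bump are hypothetical and do not come from the original text. The size comparison reflects how mainstream compilers store a hidden vptr in each object; the language standard does not mandate that layout.

```cpp
#include <cstddef>

// Hypothetical example types, not from the original text.
struct PlainCounter {
    int n = 0;
    void add(int x) { n += x; }          // non-virtual: trivially inlinable
};

struct VirtualCounter {
    int n = 0;
    virtual ~VirtualCounter() { }
    virtual void add(int x) { n += x; }  // virtual: dispatched through the vptr
};

struct DoubleCounter : VirtualCounter {
    void add(int x) override { n += 2 * x; }
};

// The compiler usually cannot know the dynamic type behind 'c' here,
// so it loads the vptr, indexes the function table, and calls indirectly.
int bump(VirtualCounter& c) {
    c.add(1);
    return c.n;
}

// On typical implementations the hidden vptr makes the object larger.
constexpr bool kVptrCostsSpace = sizeof(VirtualCounter) > sizeof(PlainCounter);
```

Calling bump with a VirtualCounter adds 1, while calling it with a DoubleCounter adds 2. The same call site therefore reaches different code at run time, which is what prevents inlining in the general case.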

-------------------------------------------------------------------------------------------
Virtual function calls that can be resolved only at run-time will inhibit inlining. At times, that may pose a performance problem that we must solve. Dynamic binding of a function call is a consequence of inheritance. One way to eliminate dynamic binding is to replace inheritance with a template-based design. Templates are more performance-friendly in the sense that they push the resolution step from run-time to compile-time. Compile-time, as far as we are concerned, is free.

The design space for inheritance and templates has some overlap. We will discuss one such example.

Suppose you wanted to develop a thread-safe string class that may be manipulated safely by concurrent threads in a Win32 environment. In that environment you have a choice of multiple synchronization schemes such as critical sections, mutexes, and semaphores, just to name a few. You would like your thread-safe string to offer the flexibility to use any of those schemes, and at different times you may have a reason to prefer one scheme over another. Inheritance would be a reasonable choice to capture the commonality among synchronization mechanisms.

The Locker abstract base class will declare the common interface:

class Locker
{
public:
    Locker() { }
    virtual ~Locker() { }
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

class CriticalSectionLock : public Locker
{
    // ... lock() and unlock() implemented with a critical section
};

class MutexLock : public Locker
{
    // ... lock() and unlock() implemented with a mutex
};
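
The bodies of the concrete lockers are not shown in the original. As a rough, portable sketch of what one might look like, the block below assumes std::mutex from the C++11 standard library as a stand-in for the Win32 primitives. The names StdMutexLock and lockedIncrement are hypothetical.

```cpp
#include <mutex>

class Locker {
public:
    Locker() { }
    virtual ~Locker() { }
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

// Assumption: std::mutex stands in for a Win32 mutex or critical section.
class StdMutexLock : public Locker {
public:
    void lock() override { m_.lock(); }
    void unlock() override { m_.unlock(); }
private:
    std::mutex m_;
};

// Through a Locker reference, both calls are bound at run time.
int lockedIncrement(Locker& l, int& value) {
    l.lock();
    int result = ++value;
    l.unlock();
    return result;
}
```

Any class derived from Locker can be passed to lockedIncrement, which is the flexibility the abstract base class buys, at the price of two virtual calls per operation.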
Because you prefer not to re-invent the wheel, you made the choice to derive the thread-safe string from the existing standard string. The remaining design choices are:

[1] Hard coding. You could derive three distinct classes from string: CriticalSectionString, MutexString, and SemaphoreString, each class implementing its implied synchronization mechanism.
[2] Inheritance. You could derive a single ThreadSafeString class that contains a pointer to a Locker object. Use polymorphism to select the particular synchronization mechanism at run-time.
[3] Templates. Create a template-based string class parameterized by the Locker type.
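
For contrast with option [3], option [2] might look roughly like the sketch below. The constructor signature and the CountingLock helper are assumptions made for illustration, not the original's code. The point is only that lock() and unlock() go through a Locker pointer and are therefore bound at run time.

```cpp
#include <string>

class Locker {
public:
    virtual ~Locker() { }
    virtual void lock() = 0;
    virtual void unlock() = 0;
};

// Hypothetical locker that only counts calls, so the sketch runs anywhere.
class CountingLock : public Locker {
public:
    void lock() override { ++locks; }
    void unlock() override { ++unlocks; }
    int locks = 0;
    int unlocks = 0;
};

// Option [2]: one string class, synchronization chosen at run time.
class ThreadSafeString : public std::string {
public:
    ThreadSafeString(const char* s, Locker* l) : std::string(s), lock_(l) { }

    int length() {
        lock_->lock();      // virtual call, not inlinable in general
        int len = static_cast<int>(std::string::length());
        lock_->unlock();    // virtual call
        return len;
    }
private:
    Locker* lock_;  // not owned; must outlive the string
};
```

With this design the synchronization scheme can change per object at run time, but every length() call pays for two indirect calls.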
////////////////////////////////////////////////////////////////////////////////////////////
Here we discuss only the template implementation.

The template-based design combines the best of both worlds: reuse and efficiency. ThreadSafeString is implemented as a template parameterized by the LOCKER template argument:
template <class LOCKER>
class ThreadSafeString : public string
{
public:
    ThreadSafeString(const char* s)
        : string(s) { }

    int length();
private:
    LOCKER lock;
};
The length() implementation brackets the call to string::length() with lock() and unlock(), just as the hard-coded and inheritance-based versions would:
template <class LOCKER>
inline
int ThreadSafeString<LOCKER>::length()
{
    lock.lock();
    int len = string::length();
    lock.unlock();

    return len;
}
If you want critical section protection, you will instantiate the template with a CriticalSectionLock:
{
   ThreadSafeString<CriticalSectionLock> csString = "Hello";
   ...
}
or you may go with a mutex:
{
   ThreadSafeString<MutexLock> mtxString = "Hello";
   ...
}
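
Putting the pieces together, here is a self-contained sketch of the template design that compiles without Win32. CountingLock is a hypothetical locker introduced for illustration, and the locker() accessor is added only so the effect can be observed. Note that in this sketch the locker does not derive from Locker at all, since the template only requires that lock() and unlock() exist.

```cpp
#include <string>

// Hypothetical locker that records calls; any type with lock()/unlock() works.
struct CountingLock {
    int locks = 0;
    int unlocks = 0;
    void lock()   { ++locks; }
    void unlock() { ++unlocks; }
};

template <class LOCKER>
class ThreadSafeString : public std::string {
public:
    ThreadSafeString(const char* s) : std::string(s) { }

    int length() {
        lock.lock();     // static type is LOCKER: resolved at compile time
        int len = static_cast<int>(std::string::length());
        lock.unlock();
        return len;
    }

    // Accessor added for the demonstration only; not part of the original.
    const LOCKER& locker() const { return lock; }
private:
    LOCKER lock;
};
```

Declaring ThreadSafeString<CountingLock> s("Hello") and calling s.length() returns 5 and leaves one recorded lock and one recorded unlock. Swapping in a different locker type changes the synchronization scheme without touching the string class.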

This design also provides relief from the virtual function calls to lock() and unlock(). The declaration of a ThreadSafeString selects a particular type of synchronization at template instantiation time. Because the lock member is an object whose exact type is known at compile time, the compiler can resolve the calls statically and inline them, just as with hard coding.

As you can see, templates can make a positive performance contribution by pushing computations out of execution-time and into compile-time, enabling inlining in the process.

posted on 2006-11-13 13:37 by Zero Lee. Category: C++ Performance

