理解STL－－－understanding stl

Introduction

STL has a lot of powerful features which are undiscovered for many programmers. Complexity of templates stops people from discovering the scope of STL. Here, I am removing that complexity by explaining with template’s generated code with commonly used data types. This article intends to trigger you for exploring the internals of STL and use its powerful features. Once you understand the features of STL, you will get addicted of using it in your development. Explaining all the container and algorithm internals are out of the scope of this article. I choose vector and some sample algorithms to explain the usage of STL.

What is STL?

The Standard Template Library (STL) is a general-purpose C++ library of algorithms and data structures, originated by Alexander Stepanov and Meng Lee. The STL, based on a concept known as generic programming, is part of the standard ANSI C++ library. The STL is implemented by means of the C++ template mechanism, hence its name. While some aspects of the library are very complex, it can often be applied in a very straightforward way, facilitating reuse of the sophisticated data structures and algorithms it contains.

Why should I learn STL?

STL gives generic collections and algorithms which can be applied on those collections. STL code can be used across the platforms. Since STL containers and algorithms are C++ templates, you are able to use any data type, if it satisfies the requirement of the templates. You can make generic code which will accept all types of container classes. Examples of STL containers are deque, list, map, multimap, multiset, set, and vector; and examples of algorithms are find, copy, remove, max, sort, and accumulate.

Vector and Algorithm Relation

Let us try to have an idea of how algorithm ‘remove’ works on a vector. Vector is an STL container which grows dynamically and have similar features of an array.

#include <iostream>
#include <vector>
#include <deque>
#include <list>
#include <algorithm>
using namespace std;
void main()
{  
    typedef /*list/deque*/vector<int > intVect ;
    typedef intVect::iterator intVectIt ;
    intVectIt end ,it,last;
            
    intVect Numbers ;  
    Numbers.push_back ( 10 );Numbers.push_back ( 33 ); Numbers.push_back(50);
    Numbers.push_back(33);    
    cout << "Before calling remove" << endl ;
    end = Numbers.end();
    cout << "Numbers { " ;
    for( it = Numbers.begin(); it != end; it++)
        cout << *it << " " ;
    cout << " }\n" << endl ;
    last = remove(Numbers.begin(), end, 33) ;
    cout << "After calling remove" << endl ;
    cout << "Numbers { " ;
    for(it = Numbers.begin(); it != last; it++)
        cout << *it << " " ;
    cout << " }\n" << endl ;
}

Output of the program

Before calling remove
Numbers { 10 33 50 33  }
 
After calling remove
Numbers { 10 50  }

In the above example, we insert 10, 33, 50, and 33 into vector Numbers. Algorithm ‘remove’ is used to remove all the elements which has a value 33. ‘remove’ uses iterator of vector for this operation. iterators have similarities to pointers and are used to traverse an array of objects. In implementation of vector<int>, it's iterator is int*.

Now, change vector to deque or list in the first line. You can see our program is working exactly same as in the vector case! For changing the container implementation, we had to change only one line of code. This is one of the big benefits which is offered by STL. Now, take the above piece of code to some other operating system (I tested with Windows and Linux ), build and run it. Without any change, it works! This is another big advantage STL offers. You can try different data types instead of int for vector container.

Constructors of vector

Now, let us look at how default constructor and constructor with an integer argument work. I will be using object of the following class for inserting into the vector.

class A
{
    int iValue;
    A()
    {
        cout << "A()" << endl;
    }
};

Class vector<A> has four member variables: allocator, _First, _Last, and _End. allocator is of type std::allocator<A>, and _First, _Last, and _End are iterators which come out to be A*. Default constructor of vector<A> initializes allocator with std::allocator<A> object. _First, _Last, _End are initialized to null pointers.

Now, let us see how constructor vector(size_type _N, const _Ty& _V = _Ty(), const _A& _Al = _A()) works. By removing template variables for class A, prototype of the constructor becomes:

vector ( unsigned int _N , const A& _V = A(), 
  const std::allocator<A> & _A1 = std::allocator<A> () )

This constructor does the following things to initialize its member variables:

_First = allocator.allocate(_N, (void *)0);
_Ufill(_First, _N, _V);
_Last = _First + _N;
_End = _Last;

allocator.allocate calls template function _Allocate passing number of objects to be created (_N). For class A, this template function is generated to A* _Allocate ( int , A * ).

This function allocates _N * sizeof ( A ) memory and returns the starting address. This is assigned to _First. _Ufill does something like this:

for (; 0 < _N; --_N, ++_F)
    allocator.construct(_F, _X);

Here, _F ( of type A* ) points to the address where object to be constructed from _X through A ( const A & ).

_Ufill copies the one object constructed (const reference to object is _V. Refer line 2.) to all _N locations. allocator.construct calls template function _Consturct passing the same arguments. A new object is created at location _F from reference to object _X (new placement operator is used here). In line 3, _Last is assigned to address after last object. _End and _Last point to the same location.

Now, let us have a look at how the destructor works. Destructor of the vector does the following things:

_Destroy(_First, _Last);
allocator.deallocate(_First, _End - _First);
_First = 0, _Last = 0, _End = 0;

_Destroy calls allocator.destroy ( _F ) for each A* _F from _First through _Last. This function will call destructor ~A() explicitly to destroy the object (remember, we allocated using placement new operator). allocator.deallocate deletes the memory pointed to by (_First) which will free memory allocated for all the objects in the vector.

How push_back works?

Most complicated and most frequently used function in a vector is push_back. Let us analyze how push_back works? push_back will insert new data at the end of the vector. It calls insert ( end(), _X). (Member function end() returns _Last and _X is reference to the object (class A) to be added.) insert calls another overloaded function insert (_Last , 1 , _X).

For class A, this function definition becomes:

void insert(A* _P, unsigned int _M, const A& _X)
{
    if (_End - _Last < _M)
    {
        unsigned int _N = size() + (_M < size() ? size() : _M);
        A* _S = allocator.allocate(_N, (void *)0);
        A* _Q = _Ucopy(_First, _P, _S);
        _Ufill(_Q, _M, _X);
        _Ucopy(_P, _Last, _Q + _M);
        _Destroy(_First, _Last);
        allocator.deallocate(_First, _End - _First);
        _End = _S + _N;
        _Last = _S + size() + _M;
        _First = _S;
    }
    else if (_Last - _P < _M)
    {
        _Ucopy(_P, _Last, _P + _M);
        _Ufill(_Last, _M - (_Last - _P), _X);
        fill(_P, _Last, _X);
        _Last += _M; 
    }
    else if (0 < _M)
    {
        _Ucopy(_Last - _M, _Last, _Last);
        copy_backward(_P, _Last - _M, _Last);
        fill(_P, _P + _M, _X);
        _Last += _M; 
    }
}

Before explaining this code, we should look at the difference between _End and _Last. _End is the end of the buffer allocated to the vector, and _Last is the end of the last value inserted. To make it clear, take the case of pushing back object of A to a vector which currently contains three objects, and assume _End and _Last point to the same location. In this case, allocator will allocate memory for 3+3 objects even though we have to insert only one. This is to reduce reallocations. After push_back, _End will point to the end of 6th object, and _Last will point to the end of the 4th object. During push_back, insert checks whether reallocation is needed ((_End - _Last < _M)). If reallocation is needed, it allocates double size or size() + _M, whichever is higher. It copies all the data to the new buffer and removes the earlier buffer. It then inserts new data at the _Last position and reassigns _Last and _End. If we have enough space to insert a new element (i.e., _End - _Last > _M), new element is inserted at _Last, and _Last is reassigned.

STL Algorithms (how it works with a vector)

Before understanding algorithms, we will have to learn how function objects work. Class of a function object defines operator()(). The result is, a template function can not detect whether you passed a pointer to a function or object of a class having operator()(). Following example will give you a clear idea about what are function objects:

#include <iostream.h>
void  f ()
{
    cout << "f()" << endl;
}
 
class X
{
public:
    void operator()()
    {
        cout << "X::operator()" << endl;
    }
};
 
template < class T >
void test_func ( T f1 )
{
    f1();
}
void main()
{
    X a;
    test_func(f );
    test_func(a);
}

Output of the program is:

f()
X::operator()

Function objects can be classified as Generator (no argument), UnaryFunction (single argument), and BinaryFunction (takes two arguments). A special case of unary and binary functions are predicates (UnaryPredicate, BinaryPredicate) which simply means function returns a bool. STL has, in the header file <functional>, a set of templates that automatically create function objects for you. It is powerful not only because it’s a reasonably complete library of tools, but also because it provides a vocabulary for thinking about problem solutions, and because it is a framework for creating additional tools.

How copy works?

Problem: copy all the data of one vector<A> to another vector<A>.

Analysis: For class A, function definition becomes (you have to pass arguments which support operator ++ () and operator *(). You can apply * and ++ to iterators.):

copy ( A* first , A* last , A* x )

Function copy evaluates *(x+N) = * ( first + N ) for all N in the range [0, last-first]. It returns x+N. Consider vector<A> objects sourceA, DestA. For copying all data from sourceA to DestA, you have to call copy (sourceA.begin() , sourceA.end() , DestA.begin()).

Before calling copy, ensure that destination vector has enough space to accommodate all the data in source vector.

How transform works?

Problem: you have a vector<int> of three members and you want to store square of each element in another vector<int>.

Solution: following piece of code does the job for you:

int square_it ( const int x)
{
    return x*x;
}
void main ()
{
    vector < int > v1, v2(3);
    v1.push_back(2);
    v1.push_back(5);
    v1.push_back(83);
    transform ( v1.begin() , v1.end() , v2.begin() , square_it );
    for ( vector<int>::iterator it = v2.begin(); it != v2.end() ; it++ )
        cout << *it;
}

For the above code, function prototype generated for transform is:

int * trnsform ( int * First , int * Last , int* x, int (*f) (const int) )

transform evaluates *(x + N ) = square_it (*(First + N ) ) for all N in the range [0, Last - First]. Note: v2 has enough memory allocated (memory for three integers) before calling transform.

How generate works?

Problem: insert three ascending numbers in a vector without using push_back.

Solution: following piece of code does the job for you:

int f ()
{
    static int x = 1;
    return ++x;
}
void main ()
{
    vector < int > v2(3);
    generate ( v2.begin() , v2.end() , f );
    for ( vector<int>::iterator it = v2.begin(); it != v2.end() ; it++ )
        cout << *it;
}

For the above code, generated function prototype for ‘generate’ is:

void generate ( int* first , int* last , int (*f)() )

generate evaluates *(first + N ) = f ( ) for all N in the range [0 , last - first].

How replace_if works?

Problem: change all 0 to -1 in the vector<int>.

Solution: assume vector<int> v contains integers 1, 2, 0, 6, 0. Following code will change all the 0s to -1:

int f (const int x)
{
   if ( x ==  0 )
       return true;
   else
       return false;
}
replace_if( v.begin() , v.end() , f , -1);

Generated prototype for replace_if is:

void replace_if ( int * first , int * last , int (*f) ( const int) , -1 )

For all N in the range [0, last-first], replace_if evaluates to:

if ( f ( *(first+N) ) )
    *(first + N ) = -1;

How for_each works?

Problem: print all the members of vector.

Solution: following piece of code does the work for you:

void f (const int x)
{
    cout << x << endl;
}
for_each ( v.begin() , v.end() , f);

Prototype generated for the above code is:

void for_each ( int * first , int* last , void (*f) (const int ) );

for_each will evaluate to:

f ( *( first + N ) ) for all N in the range [0 , last - first ]

How fill works?

Problem: fill vector<int> v with value 10.

Solution: following code does the work for you:

fill ( v.begin() , v.end() , 10 );

Generated function prototype for the above invocation of fill is:

void fill(int * first, int * last, const int &) ;

fill evaluates *(first + N) = x once for each N in the range [0, last - first].

How count_if works?

Problem: count number of '2's in an integer array.

Solution: following code does the work for you:

bool f2 ( const int x )
{
    if ( x == 2 )
        return true;
    else
        return false;
}

Assume vector<int> v contains integers. Following code will return number of '2's in the vector:

int iNoof2s = count_if( v.begin() , v.end() , f2 );

Generated function prototype for the above code is:

unsigned int count_if (  int * first , int * last , bool (*fn) ( const int ) );

count_if sets a count n to zero. It then executes ++n for each N in the range [0, last - first] for which the predicate fn(*(first + N)) is true. It evaluates the predicate exactly last - first times.

Conclusion

You have got an idea about how vector and some of the algorithms work. Try using other containers and explore more algorithms. You will discover more interesting things. Happy coding with STL.

About Jais Joy

5+ year in software industry.Currently working for IBM Bangalore
Looking forward to contribute to software industry.

Click here to view Jais Joy's online profile.

posted on 2005-12-28 08:46 梦在天涯阅读(2390) 评论(2) 编辑收藏引用所属分类: STL/Boost

# re: 理解STL－－－understanding stl 2006-01-10 11:35 shanzy

CP网站上评分才1.98，可见文章的可看程度是多少？？

原文作者使用的是VC自带的STL，都知道这个可读性几乎为0

另外，文章的出处还没有注明：

http://www.codeproject.com/vcpp/stl/stl_by_code_walk.asp 回复更多评论

# re: 理解STL－－－understanding stl 2006-01-10 12:11 shanzy

总的来说，原文作者的文章还是不错，不过有的地方说的有点含糊

http://www.codeproject.com/vcpp/stl/stlintroduction.asp

This example declares a class Value, which stores a parameterized value, _value, of type T.

value不应该算：template parameter list中的1个，要算也要算是类的成员函数的参数列表中的一个回复更多评论

刷新评论列表

只有注册用户登录后才能发表评论。


相关文章: 引领boost（六）（boost::bind）理解STL－－－understanding stl list来操作stl中的算法－－－其他的都相似！ stl中的algorithms真是强！我能想到的算法都有啊！ alloc和一些数许函数详细解说 STL 排序(Sort) 函数对象 stl----sort 与unique的调用次序和unique的用法 stl－－－Container Types and Members in Detail stl－什么时候用那种container？

网站导航: 博客园 IT新闻 BlogJava 博问 Chat2DB 管理

C++ Programmer's Cookbook