C++ Programmer's Cookbook

{C++ basics} {Advanced C++} {C# UI, C++ core algorithms} {Design patterns} {C# basics}

STL----string

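Notes from reading about string in the STL today:

I  The problem:

Suppose a text file named name.txt holds one record per line in the form "username phone-number":

Tom 23245332
Jenny 22231231
Heny 22183942
Tom 23245332
...
We need to sort the user names and print each distinct name only once.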
-------------------------------

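Doing this in plain C-style C/C++ is tedious. It takes the following steps:

  1. Open the file and check that it opened; exit on failure.
  2. Declare a two-dimensional char array that is large enough, or an array of char pointers.
  3. Read a line into the character buffer.
  4. Parse the line: find the space and store the name in the array.
  5. Close the file.
  6. Write a sort function, or write a comparison function and sort with qsort.
  7. Walk the array looking for duplicates; each one found has to be removed by copying the remaining elements down...
  8. Print the results.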

-------------------------------------
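We can use fstream in place of the cumbersome fopen/fread/fclose, and vector in place of the array. Most importantly, string replaces the char* arrays, the sort algorithm does the sorting, and the unique function removes the duplicates. Sounds pretty good. Here is the code (Example 1):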

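#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>   // ostream_iterator
#include <string>
#include <vector>
using namespace std;
int main(){
        ifstream in("name.txt");
        if (!in) {
                cerr << "cannot open name.txt" << endl;
                return 1;
        }
        string strtmp;
        vector<string> vect;
        // keep only the text before the first space (the user name)
        while(getline(in, strtmp, '\n'))
                vect.push_back(strtmp.substr(0, strtmp.find(' ')));
        sort(vect.begin(), vect.end());
        // unique() moves the distinct names to the front and returns the new logical end
        vector<string>::iterator it=unique(vect.begin(), vect.end());
        copy(vect.begin(), it, ostream_iterator<string>(cout, "\n"));
        return 0;
}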
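
One detail of Example 1 is worth isolating: unique only collapses adjacent equal elements, which is why the sort has to come first, and it does not shrink the container by itself. A minimal sketch of the same sort-then-unique step as a reusable function follows; the name distinct_sorted is made up for this illustration and is not part of the STL.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the distinct names in sorted order.
// std::unique only removes *adjacent* duplicates, so the vector must be
// sorted first; erase() then drops the leftover tail that unique() left behind.
std::vector<std::string> distinct_sorted(std::vector<std::string> names)
{
    std::sort(names.begin(), names.end());
    names.erase(std::unique(names.begin(), names.end()), names.end());
    return names;
}
```

Called with the four names from name.txt (Tom, Jenny, Heny, Tom), it should return Heny, Jenny, Tom.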
--------------------------------------------





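string function list
Function             Description
begin                Returns an iterator to the start of the string
end                  Returns an iterator to the end of the string (one past the last character)
rbegin               Returns a reverse iterator to the start of the reversed string
rend                 Returns a reverse iterator to the end of the reversed string
size                 Returns the number of characters in the string
length               Same as size
max_size             Largest size a string can possibly have
capacity             Size the string can grow to without reallocating memory
empty                Tests whether the string is empty
operator[]           Accesses the character at a given index, like an array
c_str                Returns the contents as a C-style const char* string
data                 Returns a pointer to the string's character data
operator=            Assignment operator
reserve              Reserves storage in advance
swap                 Swaps the contents of two strings
insert               Inserts characters
append               Appends characters
push_back            Appends a single character
operator+=           Appends with +=
erase                Removes characters from the string
clear                Removes all characters from the string
resize               Changes the string's length, truncating it or padding it with a fill character
assign               Assigns new contents, like the assignment operator
replace              Replaces part of the string
copy                 Copies characters into a caller-supplied buffer
find                 Searches forward
rfind                Searches backward
find_first_of        Finds the first position holding any of the given characters
find_first_not_of    Finds the first position holding none of the given characters
find_last_of         Finds the last position holding any of the given characters
find_last_not_of     Finds the last position holding none of the given characters
substr               Returns a substring
compare              Compares strings
operator+            Concatenates strings
operator==           Tests for equality
operator!=           Tests for inequality
operator<            Tests for less-than
operator>>           Reads a string from an input stream
operator<<           Writes a string to an output stream
getline              Reads a line from an input stream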
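
To make a few entries of the list concrete, here is a small sketch that combines rfind, substr, find_first_not_of and find_last_not_of. The helper names extension_of and trim are invented for this example, and extension_of assumes a plain file name with no directory part.

```cpp
#include <string>

// Hypothetical helper: returns the text after the last '.' in a plain
// file name, or an empty string if there is no dot. Uses rfind + substr.
std::string extension_of(const std::string &name)
{
    std::string::size_type dot = name.rfind('.');
    if (dot == std::string::npos)
        return "";
    return name.substr(dot + 1);
}

// Hypothetical helper: strips leading and trailing spaces and tabs using
// find_first_not_of and find_last_not_of.
std::string trim(const std::string &s)
{
    const char *blanks = " \t";
    std::string::size_type first = s.find_first_not_of(blanks);
    if (first == std::string::npos)
        return "";                       // empty, or nothing but blanks
    std::string::size_type last = s.find_last_not_of(blanks);
    return s.substr(first, last - first + 1);
}
```

Every search function in the list reports "not found" by returning string::npos, so checking for it before calling substr is the usual pattern.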


------------------------------------
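III  string provides three functions for exposing its contents as C-style character data:
const charT* c_str() const 
const charT* data() const 
size_type copy(charT* buf, size_type n, size_type pos = 0) const 
Where: 
  1. c_str returns a pointer to a \0-terminated character array.
  2. data returns the contents as an array of size() characters. Before C++11 that array is not guaranteed to end with \0; since C++11, data() returns the same \0-terminated array as c_str().
  3. copy copies up to n characters of the string, starting at pos, into the buffer buf and returns the number of characters copied. It does not append a \0.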
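
The differences between the three functions show up most clearly side by side. The sketch below is only an illustration; the function names c_length, to_c_buffer and clone_bytes are made up for it.

```cpp
#include <cstring>
#include <string>
#include <vector>

// c_str(): pass the string to a C API that expects a '\0'-terminated array.
std::size_t c_length(const std::string &s)
{
    return std::strlen(s.c_str());
}

// copy(): copies at most n characters into a caller-owned buffer and returns
// how many were copied. It does NOT write a terminating '\0', so we add one.
std::vector<char> to_c_buffer(const std::string &s)
{
    std::vector<char> buf(s.size() + 1);
    std::string::size_type copied = s.copy(&buf[0], s.size());
    buf[copied] = '\0';
    return buf;
}

// data() + size(): treat the contents as a length-delimited range, which
// also works when the string contains embedded '\0' characters.
std::string clone_bytes(const std::string &s)
{
    return std::string(s.data(), s.size());
}
```

For a string holding the three characters 'a', '\0', 'b', c_length reports 1 because strlen stops at the first '\0', while clone_bytes keeps all three characters.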

---------------------
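IV  basic_string is a class template modeled on the character sequence container (Sequence). It provides all the usual sequence-container operations as well as the standard string operations such as searching and concatenation.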
typedef basic_string <char> string;
typedef basic_string<wchar_t> wstring;
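
Because string and wstring are just two instantiations of the same template, code written against basic_string works for both. A minimal sketch, using a function name (reversed) invented for this example:

```cpp
#include <string>

// Written against basic_string rather than the string or wstring typedefs,
// so it works for any character type.
template <class CharT>
std::basic_string<CharT> reversed(const std::basic_string<CharT> &s)
{
    // rbegin()/rend() are ordinary sequence-container operations, and the
    // iterator-range constructor is the one every sequence container has
    return std::basic_string<CharT>(s.rbegin(), s.rend());
}
```

The same template can be called with a std::string or a std::wstring argument without any change.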

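V  Extended STL string

ext_string provides some commonly needed features, for example:

  1. Splitting on a separator: given a separator, break a string into fields.
  2. Replacement: for example, replace "wende" in a string with "winter".
  3. Case handling: for example, case-insensitive comparison, case conversion and so on.
  4. Integer conversion: for example, convert the string "123" to the number 123.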
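
For comparison, here is roughly what those four tasks look like with nothing but std::string. This is a sketch under stated assumptions, not ext_string's implementation: the function names split_fields, replace_all, iequals and to_long are invented here, and the case handling is limited to what tolower does in the current C locale.

```cpp
#include <cctype>
#include <cstdlib>
#include <string>
#include <vector>

// 1. Split `s` into fields on the separator character `sep`.
std::vector<std::string> split_fields(const std::string &s, char sep)
{
    std::vector<std::string> out;
    std::string::size_type start = 0, pos;
    while ((pos = s.find(sep, start)) != std::string::npos)
    {
        out.push_back(s.substr(start, pos - start));
        start = pos + 1;
    }
    out.push_back(s.substr(start));    // last field (may be empty)
    return out;
}

// 2. Replace every occurrence of `from` with `to`.
std::string replace_all(std::string s, const std::string &from, const std::string &to)
{
    if (from.empty())
        return s;                      // nothing sensible to search for
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos)
    {
        s.replace(pos, from.length(), to);
        pos += to.length();            // skip past the text just inserted
    }
    return s;
}

// 3. Case-insensitive equality (single-byte characters, current C locale).
bool iequals(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::string::size_type i = 0; i < a.size(); ++i)
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    return true;
}

// 4. Convert the leading digits of a string to an integer.
long to_long(const std::string &s)
{
    return std::strtol(s.c_str(), NULL, 10);
}
```

Each of these is short, but having to rewrite them in every project is the gap a library like ext_string sets out to fill.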

Appendix: ext_string:


/**
 * @mainpage Extended STL string
 * @author Keenan Tims - ktims@gotroot.ca
 * @version 0.2
 * @date 2005-04-17
 * @section desc Description
 * ext_string aims to provide a portable, bug-free implementation of many useful extensions to the
 * standard STL string class.  These extensions are commonly available among higher-level languages
 * such as Perl and Python, but C++ programmers are generally left on their own when it comes to
 * basic string processing.  By extending the STL's string, we can provide a drop-in replacement for STL
 * strings with the greater functionality of higher-level languages.
 *
 * The primary goal of this library is to make the STL string class more usable to programmers that
 * are doing simple string manipulation on a small scale.  Due to the usability goals of this class,
 * many actions will be inefficiently implemented for the sake of ease of use.  Some of this is
 * mitigated somewhat by doing modification in-place, however many unnecessary copies of data are
 * created by some methods, and the vector-returning methods are inefficient in that they copy the
 * substrings into the vector, then return a copy of the vector.  This would be much more efficient
 * as an iterator model.
 *
 *
 * @section feat Features
 *
 * @li Fully based on the STL, ext_string provides a superset of std::string methods
 * @li String splitting (tokenizing), on a character, a string, or whitespace
 * @li Replacement of substrings or characters with another string or character
 * @li String case operations (check, adjust)
 * @li Integer conversion
 * @li Fully open-source under a BSD-like license for use in any product
 *
 * @if web
 * @section download Downloads
 *
 * Downloads are provided in tar.gz and zip formats containing this documentation, the header file,
 * and the library's changelog.  The latest version of ext_string is 0.2, released on April 17,
 * 2005.
 *
 * @li<a href="files/ext_string-0.2.tar.gz">ext_string-0.2.tar.gz</a>
 * @li<a href="files/ext_string-0.2.zip">ext_string-0.2.zip</a>
 * @li <small><a href="files/ext_string-0.1.tar.gz">ext_string-0.1.tar.gz</a></small>
 * @li <small><a href="files/ext_string-0.1.zip">ext_string-0.1.zip</a></small>
 *
 * @section changelog Changelog
 *
 * The changelog is viewable online <a href="files/CHANGELOG">here</a>
 *
 * @endif
 *
 * @section notes Notes/Limitations
 * @li Copying all the substrings into a vector for the substring methods is pretty inefficient,
 * both for space and time.  It would be more prudent to model an iterator to split the string based
 * on the specified parameters, but this is more difficult to implement and more cumbersome to use.
 * Performance is not the main goal of this library, usability is, thus the tradeoff is deemed to be
 * acceptable.
 * @li References are not used too aptly in this class.  Some performance tuning could be done to
 * minimize unnecessary data copying.
 * @li The basic methods of std::string aren't overridden by this class, thus assigning the return
 * value of eg. string::insert() to an ext_string instance will make an unnecessary string object
 * which is then copy-constructed to an ext_string (I believe, internal workings of inheritance and
 * polymorphism in C++ are somewhat beyond my experience).  These methods should be wrapped by
 * ext_string to return the proper type.
 *
 *
 * @section related Related Documentation
 *
 * @li SGI's STL string reference: http://www.sgi.com/tech/stl/basic_string.html
 *
 *
 * @section license License
 *
 * Copyright (c) 2005, Keenan Tims
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without modification, are permitted
 * provided that the following conditions are met:
 * @li Redistributions of source code must retain the above copyright notice, this list of
 * conditions and the following disclaimer.
 * @li Redistributions in binary form must reproduce the above copyright notice, this list of
 * conditions and the following disclaimer in the documentation and/or other materials provided with
 * the distribution.
 * @li Neither the name of the Extended STL String project nor the names of its contributors may be used to endorse
 * or promote products derived from this software without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTIBILITY AND
 * FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT OWNER OR
 * CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 * DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
 * IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 *
 *
 */

#ifndef _EXT_STRING_H
#define _EXT_STRING_H

#include <string>
#include <vector>

namespace std
{

 /**
  * An extension of STL's string providing additional functionality that is often available in
  * higher-level languages such as Python.
  */
 class ext_string : public string
 {
  public:
   /**
    * Default constructor
    *
    * Constructs an empty ext_string ("")
    */
   ext_string() : string() { }

   /**
    * Duplicate the STL string copy constructor
    *
    * @param[in] s   The string to copy
    * @param[in] pos The starting position in the string to copy from
    * @param[in] n   The number of characters to copy
    */
   ext_string(const string &s, size_type pos = 0, size_type n = npos) : string(s, pos, n) { }

   /**
    * Construct an ext_string from a null-terminated character array
    *
    * @param[in] s The character array to copy into the new string
    */
   ext_string(const value_type *s) : string(s) { }

   /**
    * Construct an ext_string from a character array and a length
    *
    * @param[in] s The character array to copy into the new string
    * @param[in] n The number of characters to copy
    */
   ext_string(const value_type *s, size_type n) : string(s, n) { }

   /**
    * Create an ext_string with @p n copies of @p c
    *
    * @param[in] n The number of copies
    * @param[in] c The character to copy @p n times
    */
   ext_string(size_type n, value_type c) : string(n, c) { }

   /**
    * Create a string from a range
    *
    * @param[in] first The first element to copy in
    * @param[in] last  The last element to copy in
    */
   template <class InputIterator>
    ext_string(InputIterator first, InputIterator last) : string(first, last) { }

   /**
    * The destructor
    */
   ~ext_string() { }

   /**
    * Split a string by whitespace
    *
    * @return A vector of strings, each of which is a substring of the string
    */
   vector<ext_string> split(size_type limit = npos) const
   {
    vector<ext_string> v;

    const_iterator
     i = begin(),
       last = i;
    for (; i != end(); i++)
    {
     if (*i == ' ' || *i == '\n' || *i == '\t' || *i == '\r')
     {
      if (i + 1 != end() && (i[1] == ' ' || i[1] == '\n' || i[1] == '\t' || i[1] == '\r'))
       continue;
      v.push_back(ext_string(last, i));
      last = i + 1;
      if (v.size() >= limit - 1)
      {
       v.push_back(ext_string(last, end()));
       return v;
      }
     }
    }

    if (last != i)
     v.push_back(ext_string(last, i));

    return v;
   }

   /**
    * Split a string by a character
    *
    * Returns a vector of ext_strings, each of which is a substring of the string formed by splitting
    * it on boundaries formed by the character @p separator.  If @p limit is set, the returned vector
    * will contain a maximum of @p limit elements with the last element containing the rest of
    * the string.
    *
    * If @p separator is not found in the string, a single element will be returned in the vector
    * containing the entire string.
    *
    * The separators are removed from the output
    *
    * @param[in] separator The character separator to split the string on
    * @param[in] limit     The maximum number of output elements
    * @return A vector of strings, each of which is a substring of the string
    *
    * @section split_ex Example
    * @code
    * std::ext_string s("This|is|a|test.");
    * std::vector<std::ext_string> v = s.split('|');
    * std::copy(v.begin(), v.end(), std::ostream_iterator<std::ext_string>(std::cout, "\n"));
    *
    * This
    * is
    * a
    * test.
    * @endcode
    */
   vector<ext_string> split(value_type separator, size_type limit = npos) const
   {
    vector<ext_string> v;

    const_iterator
     i = begin(),
     last = i;
    for (; i != end(); i++)
    {
     if (*i == separator)
     {
      v.push_back(ext_string(last, i));
      last = i + 1;
      if (v.size() >= limit - 1)
      {
       v.push_back(ext_string(last, end()));
       return v;
      }
     }
    }

    if (last != i)
     v.push_back(ext_string(last, i));

    return v;
   }

   /**
    * Split a string by another string
    *
    * Returns a vector of ext_strings, each of which is a substring of the string formed by
    * splitting it on boundaries formed by the string @p separator.  If @p limit is set, the
    * returned vector will contain a maximum of @p limit elements with the last element
    * containing the rest of the string.
    *
    * If @p separator is not found in the string, a single element will be returned in the
    * vector containing the entire string.
    *
    * The separators are removed from the output
    *
    * @param[in] separator The string separator to split the string on
    * @param[in] limit     The maximum number of output elements
    * @return A vector of strings, each of which is a substring of the string
    *
    * @ref split_ex
    */
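   vector<ext_string> split(const string &separator, size_type limit = npos) const
   {
    vector<ext_string> v;

    // An empty separator can never delimit anything: return the whole string
    if (separator.empty())
    {
     v.push_back(*this);
     return v;
    }

    // Search by position with find() so we never read past end() and
    // never match inside a separator that was already consumed
    size_type
     last = 0,
     pos;
    while ((pos = find(separator, last)) != npos)
    {
     v.push_back(ext_string(begin() + last, begin() + pos));
     last = pos + separator.length();

     if (v.size() >= limit - 1)
     {
      v.push_back(ext_string(begin() + last, end()));
      return v;
     }
    }

    if (last != length())
     v.push_back(ext_string(begin() + last, end()));

    return v;
   }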

   /**
    * Convert a string into an integer
    *
    * Convert the initial portion of a string into a signed integer.  Once a non-numeric
    * character is reached, the remainder of @p string is ignored and the integer that was
    * read returned.
    *
    * @param s The string to convert
    * @return The integer converted from @p string
    */
   static long int integer(const string &s)
   {
    long int retval = 0;
    bool neg = false;

    for (const_iterator i = s.begin(); i != s.end(); i++)
    {
     if (i == s.begin())
     {
      if (*i == '-')
      {
       neg = true;
       continue;
      }
      else if (*i == '+')
       continue;
     }
     if (*i >= '0' && *i <= '9')
     {
      retval *= 10;
      retval += *i - '0';
     }
     else
      break;
    }

    if (neg)
     retval *= -1;

    return retval;
   }

   /**
    * Convert the string to an integer
    *
    * Convert the initial portion of the string into a signed integer.  Once a non-numeric
    * character is reached, the remainder of the string is ignored and the integer that had
    * been read thus far is returned.
    *
    * @return The integer converted from the string
    */
   long int integer() const
   {
    return integer(*this);
   }

   /**
    * Split a string into chunks of size @p chunklen.  Returns a vector of strings.
    *
    * Splits a string into chunks of the given size.  The final chunk may not fill its
    * entire allocated number of characters.
    *
    * @param[in] chunklen The number of characters per chunk
    * @return A vector of strings, each of length <= chunklen
    *
    * @section chunk_split-ex Example
    * @code
    * std::ext_string s("abcdefghijk");
    * std::vector<std::ext_string> v = s.chunk_split(3);
    * std::copy(v.begin(), v.end(), std::ostream_iterator<std::ext_string>(std::cout, " "));
    *
    * abc def ghi jk
    * @endcode
    */
   vector<ext_string> chunk_split(size_type chunklen) const
   {
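    vector<ext_string> retval;
    // A chunk length of zero would divide by zero below; return no chunks
    if (chunklen == 0)
     return retval;
    retval.reserve(size() / chunklen + 1);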

    size_type count = 0;
    const_iterator
     i = begin(),
     last = i;
    for (; i != end(); i++, count++)
    {
     if (count == chunklen)
     {
      count = 0;
      retval.push_back(ext_string(last, i));
      last = i;
     }
    }
    
    if (last != i)
     retval.push_back(ext_string(last, i));

    return retval;
   }

   /**
    * Join a sequence of strings by some glue to create a new string
    *
    * Glue is not added to the end of the string.
    *
    * @pre [first, last) is a valid range
    * @pre InputIterator is a model of STL's Input Iterator
    * @pre InputIterator must point to a string type (std::string, std::ext_string, char *)
    *
    * @param[in] glue  The glue to join strings with
    * @param[in] first The beginning of the range to join
    * @param[in] last  The end of the range to join
    * @return A string constructed of each element of the range connected together with @p glue
    *
    * @section join_ex Example
    * @code
    * std::vector<std::ext_string> v;
    * v.push_back("This");
    * v.push_back("is");
    * v.push_back("a");
    * v.push_back("test.");
    * std::cout << std::ext_string::join("|", v.begin(), v.end()) << std::endl;
    *
    * This|is|a|test.
    * @endcode
    */
   template <class InputIterator>
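    static ext_string join(const string &glue, InputIterator first, InputIterator last)
    {
     ext_string retval;

     // An empty range yields an empty string; without this check the
     // erase() below would be called with an out-of-range position
     if (first == last)
      return retval;

     for (; first != last; first++)
     {
      retval.append(*first);
      retval.append(glue);
     }
     retval.erase(retval.length() - glue.length());

     return retval;
    }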

   /**
    * Join a sequence of strings by some glue to create a new string
    *
    * @copydoc join
    * @ref join_ex
    */
   template <class InputIterator>
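    static ext_string join(value_type glue, InputIterator first, InputIterator last)
    {
     ext_string retval;

     // An empty range yields an empty string; without this check the
     // erase() below would be called with an out-of-range position
     if (first == last)
      return retval;

     for (; first != last; first++)
     {
      retval.append(*first);
      retval.append(1, glue);
     }
     retval.erase(retval.length() - 1);

     return retval;
    }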

   /**
    * Search for any instances of @p needle and replace them with @p s
    *
    * @param[in] needle The string to replace
    * @param[in] s      The replacement string
    * @return    *this
    * @post     All instances of @p needle in the string are replaced with @p s
    *
    * @section replace-ex Example
    * @code
    * std::ext_string s("This is a test.");
    * s.replace("is", "ere");
    * std::cout << s << std::endl;
    *
    * There ere a test.
    * @endcode
    */
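   ext_string &replace(const string &needle, const string &s)
   {
    size_type
     lastpos = 0,
     thispos;

    // An empty needle would match everywhere and never terminate
    if (needle.empty())
     return *this;

    while ((thispos = find(needle, lastpos)) != npos)
    {
     string::replace(thispos, needle.length(), s);
     // Resume after the inserted text so that a replacement which
     // itself contains the needle is not matched again
     lastpos = thispos + s.length();
    }
    return *this;
   }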

   /**
    * Search for any instances of @p needle and replace them with @p c
    *
    * @param[in] needle The character to replace
    * @param[in] c      The replacement character
    * @return           *this
    * @post             All instances of @p needle in the string are replaced with @p c
    *
    * @ref replace-ex
    */
   ext_string &replace(value_type needle, value_type c)
   {
    for (iterator i = begin(); i != end(); i++)
     if (*i == needle)
      *i = c;

    return *this;
   }

   /**
    * Repeat a string @p n times
    *
    * @param[in] n The number of times to repeat the string
    * @return ext_string containing @p n copies of the string
    *
    * @section repeat-ex Example
    * @code
    * std::ext_string s("123");
    * s = s * 3;
    * std::cout << s << std::endl;
    *
    * 123123123
    * @endcode
    */
   ext_string operator*(size_type n) const
   {
    ext_string retval;
    for (size_type i = 0; i < n; i++)
     retval.append(*this);

    return retval;
   }

   /**
    * Convert the string to lowercase
    *
    * @return *this
    * @post The string is converted to lowercase
    */
   ext_string &tolower()
   {
    for (iterator i = begin(); i != end(); i++)
     if (*i >= 'A' && *i <= 'Z')
      *i = (*i) + ('a' - 'A');
    return *this;
   }

   /**
    * Convert the string to uppercase
    *
    * @return *this
    * @post The string is converted to uppercase
    */
   ext_string &toupper()
   {
    for (iterator i = begin(); i != end(); i++)
     if (*i >= 'a' && *i <= 'z')
      *i = (*i) - ('a' - 'A');
    return *this;
   }

   /**
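    * Count the occurrences of @p str in the string.
    *
    * @return The count of substrings @p str in the string
    */
   size_type count(const string &str) const
   {
    size_type
     count = 0,
     cur = 0;

    // An empty needle matches at every position; report no occurrences instead
    if (str.empty())
     return 0;

    // Start at position 0 so that a match at the very beginning is counted
    while ((cur = find(str, cur)) != npos)
    {
     count++;
     cur++;
    }

    return count;
   }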

   /**
    * Determine if the string is alphanumeric
    *
    * @return true if the string contains only characters between a-z, A-Z and 0-9 and
    * contains at least one character, else false
    */
   bool is_alnum() const
   {
    if (length() == 0)
     return false;

    for (const_iterator i = begin(); i != end(); i++)
    {
     if (*i < 'A' || *i > 'Z')
      if (*i < '0' || *i > '9')
       if (*i < 'a' || *i > 'z')
        return false;
    }

    return true;
   }

   /**
    * Determine if the string is alphabetic only
    *
    * @return true if the string contains only characters between a-z and A-Z and contains at
    * least one character, else false
    */
   bool is_alpha() const
   {
    if (length() == 0)
     return false;

    for (const_iterator i = begin(); i != end(); i++)
     if (*i < 'A' || (*i > 'Z' && (*i < 'a' || *i > 'z')))
      return false;

    return true;
   }

   /**
    * Determine if the string is numeric only
    *
    * @return true if the string contains only characters between 0-9 and contains at least
    * one character, else false
    */
   bool is_numeric() const
   {
    if (length() == 0)
     return false;

    for (const_iterator i = begin(); i != end(); i++)
     if (*i < '0' || *i > '9')
      return false;

    return true;
   }

   /**
    * Determine if a string is all lower case
    *
    * @return true if there is at least one character, and all characters are lowercase
    * letters, else false
    */
   bool is_lower() const
   {
    if (length() == 0)
     return false;

    for (const_iterator i = begin(); i != end(); i++)
     if (*i < 'a' || *i > 'z')
      return false;

    return true;
   }

   /**
    * Determine if a string is all upper case
    *
    * @return true if there is at least one character, and all characters are uppercase
    * letters, else false
    */
   bool is_upper() const
   {
    if (length() == 0)
     return false;

    for (const_iterator i = begin(); i != end(); i++)
     if (*i < 'A' || *i > 'Z')
      return false;

    return true;
   }

   /**
    * Swap the case of a string
    *
    * @post Converts all uppercase to lowercase, and all lowercase to uppercase in the string
    * @return *this
    */
   ext_string &swapcase()
   {
    for (iterator i = begin(); i != end(); i++)
     if (*i >= 'A' && *i <= 'Z')
      *i += ('a' - 'A');
     else if (*i >= 'a' && *i <= 'z')
      *i -= ('a' - 'A');
    
    return *this;
   }
 };
}
#endif
It can also be found at http://www.gotroot.ca/ext_string/

posted on 2005-12-13 16:25 by 梦在天涯 | Views: 5638 | Comments: 1 | Category: STL/Boost

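Comments

# re: STL----string 2009-09-16 11:21 egmkang.wang

std::string::data()
isn't guaranteed to end with '\0', is it?
c_str definitely is.

In my own experiments, data() also returns a '\0'-terminated const char*.
If data() did not, programs could well run into problems...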

