Liu Weipeng (pongba) / author
waterwalk / translator
C++的罗浮宫 (The Louvre of C++, http://blog.csdn.net/pongba)
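First of all, many thanks to waterwalk for the hard work of translating :-) waterwalk posted the translation as a reply under the original post; to make it easier to read, I have extracted it, edited it, and reposted it here. This article was originally meant to organize some of my thoughts on the recent C/C++ debates. I happened to write the first two paragraphs in English and then simply carried on in English, which made it hard for many people to read, and I apologize for that. Fortunately waterwalk translated the whole article, so it is posted separately here :-)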
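The Problem
Why use C++? Before you frown and walk away, try to answer this simple question. Efficiency, right? Everybody knows that. But when you discuss a programming language, or anything related to one, you have to be precise and specific. Why? Let me ask another question: if efficiency is the only reason people use C++, why not just use C? C is generally considered more efficient than C++ (yes, I know C is not more efficient than C++ by any significant margin, so don’t get me wrong; even if the two are equally efficient, the question still stands).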
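The Myth
I know you are about to say “better abstraction mechanisms”, because C++ was, after all, designed to be a better C: it adds many high-level features without sacrificing efficiency. But the question then becomes “do developers really need those high-level features?” After all, we keep hearing about KISS, and we have all heard the claim that C is more KISS than C++ and that we should therefore use C. This never-ending argument has turned the comparison between C and C++ into one big myth (or mess). Surprisingly, many people really do seem to lean toward C, mostly on the grounds that C++ is too hard to use correctly. Even Linus thinks so.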
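The biggest effect of this phenomenon is that it pushes people toward C when they weigh C against C++. And once people start using C, they quickly become comfortable and satisfied with it (this happens with any language, and indeed with any human activity, C++ included; one often hears “I have used language XX for years and it has always worked fine”, but by that logic any Turing-complete language can be used for programming). So they start claiming that C is better than C++ even though they have never tried C++ or have not yet become competent C++ programmers. The real answer, however, almost always depends on the circumstances.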
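Did I say “it depends”? On what, exactly? Clearly there are areas where C is the better choice. Device driver development, for example, does not need OOP/GP techniques; it is just simple data manipulation, and what really matters is that the programmers know exactly how the system works and what they are doing. What about writing an operating system? I have never taken part in OS development myself, but I have read a fair amount of OS code (mostly Unix), and my impression is that a large part of an OS does not need OOP/GP either.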
But does that mean C is a better choice than C++ in every area where efficiency matters? Not necessarily.
The Answer
Let’s take the cases one at a time.
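First, when people care about efficiency, there are two kinds: time efficiency (e.g. operating systems, runtime libraries, real-time applications, high-demand systems) and space efficiency (e.g. all sorts of embedded systems). This classification does not help us choose between C and C++, however, because both are highly efficient in time and in space. What really drives the choice is the business logic (here “business logic” does not mean “enterprise application business”). For example, is the logic (that is, the structure of the code) better expressed with OOP/GP, or with nothing more than data and procedures?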
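From this point of view we can roughly divide applications into two categories (assuming, of course, that the choice is between C and C++ rather than Java/C#/Ruby/Erlang and so on): low-level applications and high-level applications. “Low-level” here means places where abstractions such as OB/OO and GP are of little use; everything else counts as high-level. Clearly, among all the areas where C/C++ is used (areas that need the efficiency of C/C++), many applications are high-level (see the list on Bjarne Stroustrup’s homepage). In these areas abstraction is at least as important as efficiency, and these are exactly the places where C++ fits.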
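But wait, there is more. Even in areas where programmers do not need high-level abstractions, C++ is not necessarily ruled out. Why? Just because your own code uses no classes or templates does not mean it cannot use a library implemented with them. With so many handy C++ libraries available (and tr1/tr2 on the way), I think there is good reason to use C++ in these areas: you can stick to the C core of C++ in your own code (KISS in whatever way you like) and still use powerful C++ libraries (such as the STL containers and algorithms, and the tr1/tr2 components).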
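As a minimal sketch of that style (the names Reading, by_value and median_value are made up for illustration and come from no library): the code below is written much as a C programmer would write it, with a plain struct and free functions, yet it leans on std::vector and std::sort instead of hand-managed arrays and qsort.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// A plain struct, exactly as it would look in C. (Hypothetical example type.)
struct Reading {
    int sensor_id;
    double value;
};

// A plain comparison function, similar in spirit to a qsort callback.
bool by_value(const Reading& a, const Reading& b) {
    return a.value < b.value;
}

// Returns the value of the middle element after sorting by value
// (the upper median for an even count), or 0.0 for an empty input.
// The vector is taken by value, so the caller's data is left untouched,
// and std::vector releases its memory automatically: no malloc/free here.
double median_value(std::vector<Reading> readings) {
    if (readings.empty()) {
        return 0.0;
    }
    std::sort(readings.begin(), readings.end(), by_value);
    assert(!readings.empty());
    return readings[readings.size() / 2].value;
}
```

Nothing in the user code defines a class with behavior or a template; the abstraction lives entirely inside the library.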
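Finally, I think people often overlook one point: sometimes KISS itself is built on abstraction. I think Matthew Wilson illustrates this very well in the prologue to his new book “Extended STL, Volume 1”. He wrote two pieces of code, one in C and the other in C++: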
// in C
DIR* dir = opendir(".");
if(NULL != dir)
{
    struct dirent* de;
    for(; NULL != (de = readdir(dir)); )
    {
        struct stat st;
        if( 0 == stat(de->d_name, &st) &&
            S_IFREG == (st.st_mode & S_IFMT))
        {
            remove(de->d_name);
        }
    }
    closedir(dir);
}
// in C++
readdir_sequence entries(".", readdir_sequence::files);
std::for_each(entries.begin(), entries.end(), ::remove);
And it is even simpler in C++09:
// in C++09
std::for_each(readdir_sequence(".", readdir_sequence::files), ::remove);
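In other words, I think that even someone who needs no classes or templates in his own code has a reason to use C++, because the handy C++ libraries he uses do rely on classes and templates. If an efficient container (or smart pointer) can free you from tedious manual memory management, why stick with primitive malloc/free? If a better string class (I did not say std::string; everyone knows it is not the best string class C++ can produce) or a regular-expression class can rescue you from piles of string-handling code that you cannot even bear to look at, why do that work by hand? If a “transform” (or “for_each”) can get the job done neatly in one line, why hand-write a for loop? If higher-order functions meet your needs, why use clumsy workarounds? (OK, I know: the last two only become truly worthwhile once C++ gains lambda support, which is exactly the job of C++0x.)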
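To make the “transform” point concrete, here is a small sketch that assumes a compiler with the lambda support that eventually shipped in C++11. The function name to_upper_all is hypothetical; std::transform, std::toupper, std::vector and std::string are standard.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Returns an upper-cased copy of every word.
// std::transform plus lambdas replaces two nested hand-written loops, and
// std::vector/std::string replace every malloc/free a C version would need.
std::vector<std::string> to_upper_all(const std::vector<std::string>& words) {
    std::vector<std::string> out(words.size());
    std::transform(words.begin(), words.end(), out.begin(),
                   [](std::string w) {
                       // Upper-case the characters of this copy in place.
                       std::transform(w.begin(), w.end(), w.begin(),
                                      [](unsigned char c) {
                                          return static_cast<char>(std::toupper(c));
                                      });
                       return w;
                   });
    assert(out.size() == words.size());
    return out;
}
```

Without lambdas, each of the two inner operations would need a separately defined function or function object, which is the clumsiness the parenthetical remark above refers to.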
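In short, I think KISS does not mean “primitive”. KISS means using the most suitable tool for the job, where “most suitable” means the tool lets you express your ideas as directly and concisely as possible without reducing the readability of the code or making it harder to understand.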
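The Real Problem
People may say that C++ is (far) easier to misuse than to use correctly, and that the complexity of a C program is, by comparison, easier to manage and control. In C++ an average programmer may well write a pile of tightly coupled classes that soon turns into a mess. But that is actually a separate issue. On the one hand, this can happen in any object-oriented language, because there are always programmers who dare to stack class upon class, layer after layer, before they understand what HAS-A and IS-A mean. They learn the syntax for defining a class and inheriting from one in some particular language, and then believe they have grasped the essence of OOP. On the other hand, the problem is more serious in C++, because C++ has so many accidental complexities that get in the way of design, and because C++ is so flexible that many problems have several solutions (think of all the GUI libraries), so that weighing the alternatives becomes a difficulty in itself.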
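For readers who want the HAS-A / IS-A distinction spelled out, here is a minimal sketch. All class names (Engine, Car, SportsCar) and the summary function are invented for illustration.

```cpp
#include <cassert>
#include <string>

class Engine {
public:
    explicit Engine(int horsepower) : horsepower_(horsepower) {
        assert(horsepower > 0);
    }
    int horsepower() const { return horsepower_; }
private:
    int horsepower_;
};

class Car {
public:
    explicit Car(int horsepower) : engine_(horsepower) {}
    virtual ~Car() {}
    virtual std::string describe() const { return "car"; }
    int horsepower() const { return engine_.horsepower(); }
private:
    Engine engine_;  // HAS-A: a Car contains an Engine (composition).
};

// IS-A: a SportsCar can stand in wherever a Car is expected (inheritance).
class SportsCar : public Car {
public:
    SportsCar() : Car(400) {}
    std::string describe() const override { return "sports car"; }
};

// Works on any Car through the base-class interface.
std::string summary(const Car& car) {
    return car.describe() + " with " + std::to_string(car.horsepower()) + " hp";
}
```

Deriving Car from Engine would also compile, but it would claim that a car is an engine; that kind of inheritance-for-reuse is how the tightly coupled hierarchies described above tend to start.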
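The accidental complexity of C++ is a result of its historical baggage, and removing it is exactly what C++0x is working to do (and the C++0x work in this area has indeed been done well). As for design, flexibility is not a bad thing: it helps good designers produce good designs. If someone complains that it costs too many brain cells, the problem may lie with the designer rather than the language; perhaps he should not be doing the design. If you worry that the advanced features of C++ will lead your colleagues astray and wreck the project, you should probably set up a coding standard and enforce it strictly (or follow the wisdom the C++ community has accumulated over the years, or, when necessary, use only the C or “C with classes” subset of C++) rather than avoid C++ because of the risk (a risk that policy can in fact contain), because avoiding C++ means giving up those C++ libraries.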
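On the other hand, a more important problem is psychological: if a language has some bizarre feature or dark corner, sooner or later someone will discover it and be drawn to it, and people will be distracted from the genuinely useful work (somewhat like Murphy’s Law), to say nothing of the corners that can lead to a (to some degree) elegant solution to a real problem. People are by nature drawn to scarce resources. Clever tricks are a scarce resource, so they attract attention easily, not to mention that mastering a trick makes a person feel special within his circle. At the very least, you will find that even a useless trick can attract plenty of interest.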
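How many dark corners are there in C++? How many tricks? All in all, how much accidental complexity is there in C++? (Anyone who knows a fair amount of C++ will know what I am talking about.)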
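To be fair, most of the tricks, or techniques if you prefer, discovered in recent years (in modern C++) were driven by real needs, especially the need to implement highly flexible and generic libraries (such as those in boost). And these tricks did provide (to some degree) elegant solutions to real problems. Think of it this way: suppose you face a dilemma where you either use arcane tricks to build something really useful, or do not build it and leave everyone without it. Which would you choose? I know the boost heroes chose the former: build it, however hard, twisted, and ugly the implementation.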
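But none of these arguments changes one fact: we deserve a language that lets us express our ideas clearly in code. Take boost.function/boost.bind/boost.tuple: variadic templates can greatly simplify the implementation of these libraries (to almost 1/10 of the original line count) and make the code (far) more concise and understandable. auto, initializer lists, rvalue references, template aliases, strongly typed enums, delegating constructors, constexpr, alignment, inheriting constructors, and so on: all these C++0x features share one goal, which is to remove the accidental complexities and embarrassments of the language.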
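A few of these features, sketched in the form they finally took in C++11 (the C++0x drafts differed in places). The names sum, square and Buffer are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Variadic template: one definition covers any number of arguments,
// instead of the hand-written 1-, 2-, 3-, ... argument overloads that
// pre-C++11 libraries had to spell out.
int sum() { return 0; }

template <typename... Rest>
int sum(int first, Rest... rest) {
    return first + sum(rest...);
}

// constexpr: an ordinary-looking function usable in constant expressions.
constexpr int square(int x) { return x * x; }
static_assert(square(4) == 16, "square(4) is computed at compile time");

class Buffer {
public:
    // Delegating constructor: reuses the sized constructor below.
    Buffer() : Buffer(16) {}
    explicit Buffer(std::size_t n) : data_(n) {}
    // Initializer-list constructor: enables Buffer b{1, 2, 3};
    Buffer(std::initializer_list<int> init) : data_(init) {}
    std::size_t size() const { return data_.size(); }
private:
    std::vector<int> data_;
};
```

Each of these replaces a pre-C++11 workaround: overload families or preprocessor-generated code for sum, template metaprogramming or macros for compile-time values, and a private init() helper shared between constructors for Buffer.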
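As Bjarne Stroustrup has said, C++ is obviously too complicated; obviously people get scared and sometimes turn away from it. But “people need relatively complex language to deal with absolutely complex problems”. We cannot make a language more powerful by removing features. Even complex features such as templates and multiple inheritance are useful, if they are exactly what you need and if you use them with great care so that you do not shoot yourself in the foot. Of all the complexity in C++, what really gets in our way is the “accidental complexity” (some call it “embarrassments”), not the programming paradigms the language supports (there are only three of them). This is an important reason to embrace C++0x: it aims to remove those long-standing accidental complexities and to make the arcane tricks unnecessary (there is clearly a mountain of them right now; flip through the C++ books or look at the boost libraries and you will see what I mean), so that we can express our ideas directly and clearly.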
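The Conclusion
C++ is hard to use, and even harder to use correctly. So when you decide to use it, be careful, and always keep in mind what you need and what you are trying to achieve. Here is a simple guideline: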
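Do we need high efficiency?
If so, do we need abstractions (think carefully about this one, because it is hard to judge whether the benefit of using the high-level features of C++ outweighs the risk of misusing them; the right answer depends on how skilled the programmers are, which coding standard is followed, how well it is enforced, and so on)?
If so, use C++. If not, then:
Do we need C++ libraries to simplify development?
If so, use C++. But always remember what you are doing: if your code does not need the “pretty” abstractions, do not reach for them, or you will get sucked in. Do not use classes and templates merely because you are writing in a .cpp file with a C++ compiler.
If not, use C. But then you may wonder why not just use the C core of C++. The reason is the same as always: people are easily sucked into the “pretty” features of a language before they know whether those features are useful. I have lost count of how many times I wrote a pile of classes and inheritance only to ask myself in the end “what are all these classes and inheritance for?” So if you can stick to the C or “C with classes” subset of C++ and follow the principle of keeping simple things simple, or if you need to migrate C code to C++, then use C++, but be very careful. On the other hand, if you need neither abstraction mechanisms nor C++ libraries because the job is so simple that it needs no convenient components such as containers and strings, or you have concluded that the benefit C++ would bring to the project is too small to be worth the risk, or there simply are not enough people who can use C++ well, then you are probably better off with plain C.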
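Purely as an illustration, the guideline above can be read as a small decision function. The type, field and function names are hypothetical, and the OutOfScope result for projects that do not need efficiency is an assumption of this sketch: the guideline itself only covers the case where efficiency matters.

```cpp
#include <cassert>

enum class Choice { C, Cpp, OutOfScope };

// Answers to the questions in the guideline. (Hypothetical struct.)
struct Project {
    bool needs_efficiency;
    bool needs_abstractions;
    bool needs_cpp_libraries;
    // True if the team can stick to the C core / "C with classes" subset,
    // or if the code needs a migration path from C to C++.
    bool can_stay_disciplined;
};

Choice choose_language(const Project& p) {
    if (!p.needs_efficiency) {
        return Choice::OutOfScope;  // assumption: not addressed by the guideline
    }
    if (p.needs_abstractions) {
        return Choice::Cpp;
    }
    if (p.needs_cpp_libraries) {
        return Choice::Cpp;  // but keep your own code free of needless abstraction
    }
    // Neither abstractions nor libraries are needed: C++ only if the team
    // can keep simple things simple, otherwise plain C.
    return p.can_stay_disciplined ? Choice::Cpp : Choice::C;
}
```

The judgment calls (how skilled the team is, how well the coding standard is enforced) are hidden inside the boolean inputs, which is where the real difficulty lies.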
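The bottom line: keep simple things simple (but remember that simplicity can be achieved through high-level libraries); use abstraction only when necessary (never abuse it; follow good design methods and best practices).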
Original text:
The Problem
So, why C++? Before you frown and turn away, just try to answer this simple question.
Efficiency, right? Everybody knows the answer. But as it turns out, when discussing a programming language or anything related to one, one should be very specific. Now why’s that? Let me ask you another question: if efficiency is the only reason people use C++, then why don’t they just use C? C is admittedly more efficient than C++ (yeah, yeah, I know it has been shown that C isn’t more efficient than C++ to any significant extent, so don’t get me wrong here, because even if they are equally efficient, the problem still exists).
The Myth
I know you are going to say “better abstraction mechanism”, because after all C++ is designed to be a better C, one that has uncompromised efficiency and yet at the same time has all those fancy high-level features. But then the problem comes down to “does it really matter if the developers need those fancy features?” I mean, after all we all have been hearing voices about KISS and stuff, and we all have heard about the claim that, compared to C++, C is more KISS so we should use C. This unstoppable argument has turned the comparison between C and C++ into a big myth (or maybe a mess). And surprisingly, it seems that many people do incline to C, the reason mostly being that C++ is so hard to use right. Even Linus thinks so, too.
The really serious impact of this phenomenon is that it drives more people to C when they’re weighing their options between C and C++; and once they start using C, they soon get satisfied and comfortable with what suffices, experiencing what is called “satisfaction”. This is when they come out and claim that C actually is a better choice than C++, even though they never actually tried C++ or aren’t adequately good C++ programmers at all. The real answer, however, almost always begins with “it depends”.
So, did I say “it depends”? On what? Obviously there’re some areas where C is a better choice than C++. For instance, device driver development is usually something that doesn’t need fancy OOP/GP techniques. It’s just simple data manipulation; what really matters is the programmers know exactly how the system works, and what they’re doing. Now what about OS development? I’m not a guy who’s been involved in any kind of OS development myself, but having read a fair amount of OS code (Unix mostly), I’ve come to feel that there’s a significant part of the OS development that doesn’t need OOP/GP either.
However, does that mean that, in all those areas where efficiency matters, C is a better choice than C++? Not really.
The Answer
Let’s do this case by case.
First of all, when people are concerned about efficiency, there’re really two kinds of efficiency – time efficiency (e.g. OS, runtime, real-time applications, high-demanding systems) and space efficiency (e.g. all sorts of embedded systems). However, this categorization doesn’t really help us determine whether we should use C or C++, because C and C++ are both extremely efficient as to both time and space. What really affects our language choice (between C and C++, of course) is the business logic (here by “business”, I don’t mean the “enterprise application business”). For example, is it better to use OOP/GP to express the logic or is it better off being kept pretty much just about data and procedures.
From this point of view, we can vaguely divide applications into two categories (of course, with the premise that what we’re concerned with is C/C++, not java/c#/ruby/erlang etc.): low-level applications and high-level applications, where low-level applications means the ones where fancy abstractions such as OB/OOP and GP are pretty much of no use, and high-level means all the rest. Now, obviously, of all the areas where C/C++ is used (because of their high-efficiency), there’re a significant number of “high-level” applications (see those listed on Bjarne Stroustrup’s homepage), where abstraction is just as important as, if not more important than efficiency. And those are precisely the places where C++ is used and useful in a unique sense, and where C++ is a better choice than C.
Wait, there’s more. As it turns out, even in those areas where programmers don’t use high-level abstractions in their code per se, there might be a reason they should use C++, too. Why’s that? Just because your code doesn’t use classes or templates doesn’t mean it can’t use a library that does. Considering the availability of all the handy C++ library facilities (with tr1/tr2 coming soon), I think there’s a pretty strong reason to use C++ in these cases: you can stick to the C core of C++ when coding (KISS in any way you want), and at the same time you’ve got some awesome C++ libraries at your disposal (e.g. STL containers and algorithms, tr1/tr2 components, etc.). And finally, there’s one thing that many people ignore: sometimes KISS relies on abstractions. I think Matthew Wilson made a crystal clear point about this in the prologue of his new book “Extended STL, Vol 1”, where he laid down two blocks of code, one written in C and one in C++:
// in C
DIR* dir = opendir(".");
if(NULL != dir)
{
    struct dirent* de;
    for(; NULL != (de = readdir(dir)); )
    {
        struct stat st;
        if( 0 == stat(de->d_name, &st) &&
            S_IFREG == (st.st_mode & S_IFMT))
        {
            remove(de->d_name);
        }
    }
    closedir(dir);
}
// in C++
readdir_sequence entries(".", readdir_sequence::files);
std::for_each(entries.begin(), entries.end(), ::remove);
And it’s even simpler in C++09:
// in C++09
std::for_each(readdir_sequence(".", readdir_sequence::files), ::remove);
I think this is exactly the reason why one should use C++ even in those cases where he doesn’t really need classes or templates in his own code: the handy C++ libraries he will find very useful do. Similarly, if an efficient container (or a smart pointer) will save you from all the boring work of manual memory management, then what’s the point of using the primitive malloc/free? If a better string class (I’m not talking about std::string; everybody knows it’s not the best C++ can do) or regex class can relieve you of all the cluttered string-manipulation code you don’t even want to look at, then what’s the point of doing it manually? If a ‘transform’ (or a ‘for_each’) can do your job in one line so succinctly and clearly (and I know, of course, that C++ needs lambda function support for those; that’s what C++0x is for), then what’s the point of hand-written for-loops? If higher-order functions are really what you need, then what’s the point of using awkward workarounds to get the same result?
KISS doesn’t mean “primitive”; KISS means using the most suitable tool for your job, where “most suitable” means the tool you use should help you express your mind as straight (and succinct) as possible, as long as it doesn’t compromise the readability and understandability of the code.
The Real Problem
People might say that C++ is much more easily misused than properly used, and that C, on the other hand, is always more manageable and controllable as to complexity. In C++, an average programmer might come up with a whole bunch of highly coupled classes that degenerate fast into a big mess. But this is actually a separate issue. On the one hand, it can occur in pretty much any object-oriented language. There are always programmers who dare to write classes on top of classes before they have any idea what HAS-A is and what IS-A is; they learn all the syntax of defining a class and inheriting one from another, and they think they’ve grasped the essence of OOP. On the other hand, the reason it appears to be more serious in C++ is that C++ has so many accidental complexities that impede design, and that C++ is so flexible that pretty much every problem has several alternative solutions (think of all the GUI libraries), so that weighing all the options becomes a hard job in itself. The accidental complexities are historical baggage that C++0x is trying hard to (and hopefully will) get rid of. The flexibility with respect to design isn’t actually a bad thing if you think about it: it helps good designers make good designs, and if someone blames it for hurting his brain then maybe it’s his problem, not the language’s; maybe he shouldn’t be the one making the design. And if you’re worried that your fellow C++ coders will be enticed by fancy high-level features and that your project will eventually get screwed, then maybe what you should do is set up a coding standard and enforce it (or you can just follow the collective wisdom, or stick to the C core or C with class part of C++ if necessary), rather than flinch away just because there are risks (risks that can be avoided by policies), because then you will not be able to access all the C++ libraries anymore, mind you.
On the other hand, there’s a more important psychological problem: if there’s something bizarre in a language, then eventually someone will find it and people will be attracted by it, and it will draw energy away from people’s main effort of doing something really useful (it’s kind of like Murphy’s Law), let alone the oddities that can lead to an (on some level) elegant solution to a real problem. People are inherently attracted by scarce resources. Corollary: tricks and oddities are scarce resources, so they draw people’s attention, not to mention the fact that mastering a trick makes one feel special in the herd. The bottom line is that even useless tricks draw people’s attention heavily.
How many black corners are there in C++? How many tricks are there in C++? All in all, how many accidental complexities are there in C++?
To be fair, most of the tricks and (you might say) techniques that have been discovered in recent years (i.e. modern C++) are driven by real needs, particularly the need to implement highly flexible and generic library components (think of all the components in boost). And they did lead to (on some level) elegant solutions to real problems. Think about it this way: suppose you’re put in a place where either you have to use tricks to implement something really useful, or you don’t implement it and other people won’t have the benefit of using it. What would you choose? I know that the boost heroes chose the former: implementing them, no matter how hard and tricky and cumbersome the implementation is.
But all those arguments don’t change the fact that we deserve to have a language that supports a clean way to express our minds in code. Take boost.function/boost.bind/boost.tuple for example: variadic templates will tremendously simplify the implementation of the three libraries (and many, many more to come) by reducing the LOC to nearly 1/10 of the original, and the code will become succinct and as simple as possible, too. Auto, initializer-list, rvalue-reference, template-aliasing, strong-typed enums, delegating-constructors, constexpr, alignments, inheriting-constructors, etc.: all those C++0x features have one goal, eliminating the various accidental complexities or embarrassments of the language.
As Bjarne Stroustrup said, obviously C++ is too complicated; obviously people get scared and sometimes turn away. But “people need relatively complex language to deal with absolutely complex problems”. We can’t make a language more powerful by taking features away from it. Complex features like templates and even multiple inheritance can be useful if they’re exactly what you need; you just have to use them very carefully and only by necessity, so that you don’t shoot yourself in the foot. Of all the complexities in C++, the ones that really get in our way are the accidental complexities (someone might call them “embarrassments”), not the paradigms the language supports (there are only three). And that’s a very important reason why we should embrace C++0x, because it aims at eliminating the long-standing accidental complexities of C++ and making obsolete all the arcane tricks (there’s an absolutely huge number of them out there; check all the C++ books and maybe the boost library and you’ll know what I’m talking about), so that we can express our minds clearly and directly.
The Conclusion
C++ is hard, and even harder to use correctly. So when you decide to use it, be careful, always know where you are and what you really want. Here’s a simple guideline:
Do we need to be efficient?
If so, then
Do we need abstractions in our code (think very carefully on this one, because it’s very hard to estimate whether the benefit of using the high-level features of C++ outweighs the risk of using them incorrectly; the proper answer depends on how well trained your programmers are, what coding standard you follow and how well it’s enforced, etc.)?
If so, then use C++. Otherwise,
Do we need good C++ libraries to ease our job?
If so, then use C++, but meanwhile always remember what you are doing – if your code doesn’t really need all the fancy abstractions, then try not to get sucked into them; don’t use class or templates just because you’re writing code in a .cpp file and using a C++ compiler.
Otherwise, use C, but then you might wonder why not just use the C core of C++. The same reason as always: people get easily sucked into fancy language features even when they don’t really know if they’re going to help. I can’t tell you how many times I wrote a bunch of classes only to ask myself “what the heck are these classes for?”. So, if you can stick to the C core or C with class part of C++ and keep simple things simple, or if your code needs a migration path from C to C++, then use C++, but be very careful. On the other hand, if you need neither abstraction mechanisms in your code nor quality C++ libraries because what you’re doing is so simple that you don’t even need convenient components like containers or strings, or you decide that the benefit C++ can bring to your project is so minor that it’s not worth taking the risk, or you simply don’t have enough people who can use C++ properly, then maybe you should stick to C.
The bottom line: keep simple things simple (but remember that simplicity can be achieved by using high-level libraries); use abstractions when necessary (and even then, use them sparingly; follow good design principles and established good practices).