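Lately I have been battling it out in BF2142 over vLan, and the game has me thoroughly hooked, so I started paying attention to the engine behind the BF series. All I knew was that the scripting layer is written in Python. I came across this short interview on an overseas site and have translated it here for reference.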

Continuing our series of occasional interviews with game developers about current and upcoming hardware and game graphics engines, we chat with Marko Kylmamaa, senior graphics programmer for Digital Illusions' Canadian studio.

    This issue's interviewee is Marko Kylmamaa, senior graphics programmer at DICE.

FiringSquad: First, Intel and AMD are pushing dual core processors and within the next year four core processors are due to be released. How will DICE support this kind of tech in the Battlefield 2/2142 engine and will there be any need for special programming to fully support multi core CPUs in PCs?

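    Question: Intel and AMD are pushing dual-core CPUs, and four-core CPUs are due within the next year. How will DICE support this in the BF2/2142 engine, and does it call for any special programming?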

Marko Kylmamaa: While a program geared towards a single-core machine may run fine, with some exceptions, and perhaps even somewhat faster on a multi-core machine, in order to realize the real performance benefits a careful attention has to be paid into structuring the code for the correct granularity in mind, to make it suitable for multi-core execution. With the introduction of the next generation consoles and the PC hardware, the whole industry is in a learning phase for understanding the differences between the traditional multi-threading approaches, and multi-threading for multiple cores. DICE is working closely with hardware vendors in making sure that all of the future titles make the maximum use of the available multi-core architecture.

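    Answer: A program written for a single core may run fine on a multi-core machine, sometimes even a little faster, but real gains require structuring the code with the right granularity for multi-core execution. With next-generation consoles and PC hardware arriving, the whole industry is still learning how multi-core threading differs from traditional multi-threading. DICE is working closely with hardware vendors so that future titles make full use of multi-core architectures.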

FiringSquad: The 64-bit CPU has taken longer to really appear in mainstream PCs than some people expected. Do you think 64-bit CPUs will become more popular and how does DICE support it in their Battlefield 2/2142 engine ?

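    Question: 64-bit CPUs have taken longer to reach mainstream PCs than some people expected. Do you think they will become more popular, and how does the BF2/2142 engine support them?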

Marko Kylmamaa: One of the problems with harnessing the full power of 64-bit CPUs is the lack of adoption of 64-bit operating systems. Due to this it's difficult for the game developers to make full use of the 64-bit execution potential without providing a separate set of executables compiled for the different operating systems. The current Battlefield 2 technology has been thoroughly tested on the 64-bit architecture for guaranteeing a solid performance, and optimizations have been made where possible with such architectures in mind.

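    Answer: The main obstacle is the slow adoption of 64-bit operating systems. Without shipping separate executables for each operating system, developers cannot fully exploit 64-bit execution. BF2 has been thoroughly tested on 64-bit hardware and optimized for it where possible.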

FiringSquad: Game physics are getting more and more attention as well with more attention being put into destructible objects and better collisions. Where does DICE stand on this kind of support for its engine and what solution is best; having a dedicated card (AGEIA) using a graphics card (ATI/Havok) or using a CPU to handle it?

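    Question: Game physics is getting more and more attention. Where does DICE stand on it, and which solution is best: a dedicated AGEIA card, a graphics card (ATI/Havok), or the CPU?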

Marko Kylmamaa: Especially with multiplayer games in mind, it is difficult to make use of scaleable physics, since especially from the gameplay perspective all of the players must experience the same end result in simulation regardless of their hardware. This leads to a lot of the scalability of the physics being used for visual effects such as richer particle effects or fluid simulation. The GPU can of course be used for offloading the physics simulation from the CPU, but this will compete with the remaining processing time for graphics. Therefore in most cases it is necessary to strike the right balance between the CPU and GPU usage with the needs of the particular game in mind. The next generation technology at DICE is being built on the bleeding edge and will make use of very comprehensive physical modeling.

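    Answer: Scalable physics is hard to use in multiplayer games, because for gameplay every player must see the same simulation result regardless of hardware. Scalability in physics therefore mostly goes into visual effects such as richer particle effects or fluid simulation. The GPU can take physics work off the CPU, but that competes with the time left for graphics, so the right CPU/GPU balance depends on the game. DICE's next-generation technology will use very comprehensive physical modeling.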

FiringSquad: HDR lighting is also getting a lot of attention in more PC games. How does the Battlefield 2/2142 engine support those features and how will that help the graphics in games that use it?

    Question: HDR lighting is also getting a lot of attention. How does the BF2/2142 engine support it, and how does it improve a game's graphics?

Marko Kylmamaa: HDR lighting can add significantly to the perceived realism in the modern graphics engines. It is becoming an increasingly common feature as the new hardware supports full floating point surfaces and has the required processing power for supporting a multitude of such high end features.
Some aspects of the HDR lighting were simulated especially in the Battlefield 2 Expansion Pack: Special Forces, for adding a degree of realism to the night-time look. The effect is fairly subtle and was used mainly for fine tuning the overall look. Battlefield 2142 does not have night-time levels, so the same technology was not applicable to it, however there are a great number of special lighting effects for enhancing the desired futuristic look of the game.

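    Answer: HDR lighting adds noticeably to perceived realism and is becoming common now that hardware supports full floating-point surfaces and has the processing power for it. Some aspects of HDR were simulated in BF2: Special Forces to make the night-time levels look more realistic. BF2142 has no night-time levels, so that technique (HDR, presumably) was not applicable, but it uses many other special lighting effects for its futuristic look.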

FiringSquad: More and more games are using extensive pixel and vertex shading for visual and art effects. How does the Battlefield 2/2142 engine support these features currently and how will pixel and vertex shaders be used in the future, particularly with Windows Vista and DirectX10 support?

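    Question: More and more games use pixel shaders (PS) and vertex shaders (VS) extensively. How does the BF2/2142 engine support them, and how will they be used in the future, especially with Vista and DX10?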

Marko Kylmamaa: The Battlefield 2 engine has been built on the DirectX9 architecture and is a fully shader based model. This allowed for a great flexibility during the development, and not supporting the older fixed function pipeline model allowed us to concentrate solely on the high end features. Battlefield 2142 is based on the improved Battlefield 2 technology and will be released later this year, so considering that the DirectX10 hardware won't be widely available just yet, it hasn't been beneficial to re-architect the engine into a DirectX10 based model for this release. This allowed the available time to be used for adding a number of new special effects and polishing the overall look of the existing engine.

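    Answer: The BF2 engine is built on the DX9 architecture and is fully shader based. That gave great flexibility during development, and dropping the fixed-function (FF) pipeline let the team focus on high-end features. BF2142 builds on improved BF2 technology and ships later this year; since DX10 hardware will not be widely available by then, re-architecting the engine for DX10 was not worthwhile for this release. The time went into new special effects and polishing the existing engine instead.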

FiringSquad: What other advanced hardware and graphical features do you think will be supported in upcoming Battlefield 2/2142 engine games and in future graphics engine?

    Question: Which other advanced hardware and graphics features do you expect in upcoming BF2/2142 engine games and in future engines?

Marko Kylmamaa: Battlefield 2142 will support a large range of high end special effects geared towards creating the desired futuristic look. These involve for example new atmospheric effects for creating a unique look that is quite different from Battlefield 2.

    Answer: BF2142 supports a wide range of high-end effects aimed at its futuristic look, for example new atmospheric effects that give it a look quite different from BF2.

FiringSquad: Finally, Mark Rein from Epic has said that Intel is hurting the PC gaming industry through its use of integrated graphics in PCs. Is this a real threat and if so what can be done about this from the game developer's side?

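    Question: Finally, Mark Rein of Epic (don't tell me you haven't heard of them; UT2007 is coming soon) has said that Intel is hurting the PC gaming industry with integrated graphics. Is this a real threat, and what can game developers do about it?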

Marko Kylmamaa: Intel produces what you could call the ultra-low end graphics cards for a market segment that typically doesn't wish to invest the money into a higher end, gaming geared hardware. Clearly there is a demand for this type of hardware as Intel's graphics cards boast a large user base. However, this does impose challenges for the games industry in our attempts at reaching especially for the casual gamer market. Hardware requirements for the next generation games keep growing faster than what is needed for running general applications, which increases the rift between the casual and hardcore hardware markets. I believe that we as an industry will also have to recognize the different requirements these markets impose.
From the perspective of a developer, it can be difficult or in some cases practically impossible to make the high-end game run on the ultra-low end hardware. Supporting such scalability range in performance could be prohibitive with the required development time and cost in mind. It is ultimately up to each developer to find the correct range of hardware which allows for the desired market penetration.

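    Answer: Intel makes what you could call ultra-low-end graphics for buyers who do not want to pay for gaming hardware, and its large user base shows the demand is real. That does make it hard for the games industry to reach the casual market. Hardware requirements for games grow faster than those of general applications, which widens the gap between the casual and hardcore hardware markets, and the industry will have to recognize how different those markets are. For a developer, running a high-end game on ultra-low-end hardware is difficult and sometimes practically impossible, because supporting such a performance range can be prohibitive in time and cost. In the end each developer has to pick the hardware range that gives the market reach they want.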

posted @ 2006-11-10 11:44 周波

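Reading Notes

    I recently found a few good books, read them closely, and liked them a lot, so I am sharing them here.

    Interviews on the Eighties (八十年代访谈录) (download)

    I was born in the late eighties, so I have no standing to judge that era. Most of us are still shaping our own future in our own time, even though few people know that we live worse than pigs and dogs under lies and raw power, without so much as a human posture. Every sigh asks me the same thing: lie down or move forward? Mastering technology may matter, but one old line closes off that excuse: "苟利国家生死以，岂因祸福避趋之" (if it benefits the country I will give my life; how could I dodge it over personal fortune or misfortune). Whenever I see today's schoolchildren growing up in such a filthy environment, I console myself that each generation has its own work to do.
    Face our predecessors squarely, face ourselves, face our society, this small tribe.

    The End of History (历史的终结) (download)

    Is history just the same play staged again and again? Planes and artillery have replaced spears and broadswords, but whatever new clothes massacre and tyranny wear, the change is only on the surface, and people keep living out the fate of "兴百姓苦，亡百姓苦" (the people suffer when a dynasty rises and suffer when it falls). Can comfort, peace and prosperity be secured once and for all? That is the direction things are heading now: nobody wants to start a war or a revolution any more, because the cost is too high and the danger too great. But that also means glorious history may end, and the romance of campaigning that men long for may be buried in the endless commute, becoming pure fantasy. From then on comes an age of democracy and freedom, of ever-rising capital and ever-growing material wealth, so long as someone keeps doing science and technology. Yet I believe that moment may also be when darkness and upheaval arrive.

    Adventures of the Mind: Conversations with Masters (思想的历险，与大师对话) (online at sina)

    Our country lacks masters. Few economists stay fearless in adversity, few musicians command the whole historical repertoire, and writers no longer have a distinctive view of life. What we lack is masters; what we lack even more is the perseverance and courage to become one.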

posted @ 2006-11-10 10:02 周波

A Small Talk about a General Computing Platform

When Brook Meets ICE
Bosch Chou （zhoubo22@hotmail.com）

    As we have seen, distributed communication technologies such as CORBA, DCOM, and even Java are widely used in some corners of the world. All of them can serve purposes such as RPC, distributed computing, and other business and scientific applications.
    Now look at how PC hardware has developed. CPUs are faster and cheaper than ever. So is the GPU, or more generally the card we call the display adapter. Since NVIDIA released the GeForce series in 1999, graphics cards have kept pushing performance, and next year DX10 cards will be on the market. A graphics card can do vertex transform and lighting in place of the CPU, which is great progress for both CPU and GPU. How do we use these rich SIMD resources? That question is why we focus on the GPU.
    Calm down first: what platform do we actually want?

Cross operating system
Cross network
Cross hardware: this is the key problem I am trying to solve.

Of the properties listed here, all but the last have already been solved by current technology. So how do we solve the last one? I found two treasures. ICE, the Internet Communications Engine, is similar to classic CORBA but much easier to use. Brook, from Stanford University, has been developed for years and is designed for GPU stream computing. Both are used the same way: a front-end compiler translates the source definitions into C++. We then add the generated .h and .cpp files to our project and code against the interface.
The process by which a client passes its call to the server is as follows.

The client passes the data to be computed to an interface declared on both sides.
The server receives the data, computes on it, and passes the results back to the client.
The client receives the results and carries on with its own work.

The problem is that these are two kinds of IDL: one for Internet applications, the other for local GPU stream computing. Moreover, ICE has no stream data type. This sounds like C++ metaprogramming, but the two are quite different. Does that mean we must define a new IDL? Let's check the tools we already have.
In fact, the most important thing is the base model. ICE supports a type called "Sequence", which maps to an STL container in C++. It can be treated as the base data type of the language we would have to invent. When a client sends a request and the server accepts it, the client sends the data wrapped in this container, and the server rebuilds it in memory as a texture. Once the server has prepared all the textures holding the data, it calls the graphics API and uses shaders to compute on the data. The whole process is illustrated below.
For example, suppose we write these IDL declarations.

GPU Interface Foo
{
    Add([in] float a<>, [in] float b<>, [out] float c<>) {
        /* some stuff */
    }
}

CPU Interface Bar
{
    Add([in] float a[], [in] float b[], [out] float c[]) {
        /* some stuff */
    }
}

We declared two interfaces. Note that "GPU" and "CPU" are keywords here; they mark where each interface runs, one on the traditional CPU and the other on the GPU.

// On the server side, after verifying the validity of the data
void Add(const std::vector<float>& tex1,
         const std::vector<float>& tex2,
         std::vector<float>& result)   // pass by reference, avoid stack copies
{
    // Convert the container to a texture-ready buffer.
    std::vector<GLfloat> tex1Data(tex1.begin(), tex1.end());
    /* same for tex2 */

    GLuint hTex1;
    glGenTextures(1, &hTex1);   // the original passed &Tex1, an undeclared name
    glBindTexture(GL_TEXTURE_2D, hTex1);
    glTexImage2D( /**/ , tex1Data.data()); // upload the data into memory as a texture

    glUseProgram(g_hArithmetic);
    /* Draw something (a rectangle etc.) so the shader runs over all the data. */
}
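The "convert container to texture structure" step above can be sketched on the CPU side without any GL calls. This is only an illustration under assumptions: `TextureBuffer` and `packToTexture` are hypothetical names, and the row-major layout with zero padding is one possible choice, not something the ICE or Brook tooling prescribes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a float buffer laid out as a width x height grid.
struct TextureBuffer {
    std::size_t width;
    std::size_t height;
    std::vector<float> texels;  // row-major, width * height entries
};

// Pack a 1-D sequence into a 2-D grid no wider than maxWidth.
// The unused tail of the last row is zero-padded.
TextureBuffer packToTexture(const std::vector<float>& data, std::size_t maxWidth) {
    assert(maxWidth > 0);
    TextureBuffer t;
    t.width = std::max<std::size_t>(1, std::min(data.size(), maxWidth));
    t.height = std::max<std::size_t>(1, (data.size() + t.width - 1) / t.width);
    t.texels.assign(t.width * t.height, 0.0f);
    std::copy(data.begin(), data.end(), t.texels.begin());
    return t;
}
```

The resulting `texels.data()` pointer is what a call such as glTexImage2D would receive, with `width` and `height` as the texture dimensions.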

If you are familiar with GL programming, you will ask, "Why not add glFlush and a buffer swap above?" That is the key point of this whole article. If we only needed 1 + 1, we would not need the GPU at all. But people are always greedy. What should we do if we want the GPU to compute π for us? Suppose we want 16 million digits of π, but a GPU texture unit can hold at most a 4096x4096 floating-point texture. Each time the GPU swaps buffers, we must move all the data from the framebuffer to disk, save it, and then let the GPU continue computing. But how? I checked the OpenGL and D3D manuals and found nothing useful, so I thought of several ways this key problem might be solved.

A next-generation hardware architecture in which the CPU integrates the GPU; I think AMD and ATI will do this.
Improved APIs and drivers that support operating on the SIMD registers directly.

That is all I have to say about this particular aspect of distributed computing: how to make the GPU compute the way a CPU does. If this can be implemented one day, I think modern science will benefit greatly from it.
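Until such hardware or drivers exist, one workaround is to cut the data set into passes that each fit in one texture and read the results back between passes. The sketch below only does the bookkeeping; `Pass` and `splitIntoPasses` are hypothetical names, and the limit of 4096 x 4096 texels per pass is taken from the text above as an assumption about the hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One GPU pass: a contiguous slice [offset, offset + count) of the full data set.
struct Pass {
    std::size_t offset;
    std::size_t count;
};

// Split `total` elements into passes of at most `maxTexels` elements each.
std::vector<Pass> splitIntoPasses(std::size_t total, std::size_t maxTexels) {
    assert(maxTexels > 0);
    std::vector<Pass> passes;
    for (std::size_t off = 0; off < total; off += maxTexels) {
        std::size_t remaining = total - off;
        passes.push_back({off, remaining < maxTexels ? remaining : maxTexels});
    }
    return passes;
}
```

For each pass the host would upload the slice, draw, read the framebuffer back, and store the result before starting the next slice. That readback is exactly the cost the two proposals above try to avoid.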

References:
ICE, Internet Communications Engine, ZeroC, Inc., http://www.zeroc.com/
Brook, Stanford University, http://sf.net/projects/brook
NVIDIA Developer Zone, http://developer.nvidia.com/
OpenGL official site, http://www.opengl.org/

posted @ 2006-10-28 11:58 周波

GPU Gems 3 Is Coming

Today I went looking for the WGL specification and unexpectedly came across the call for submissions for GPU Gems 3.

http://developer.nvidia.com/object/gpu-gems-3-call-for-participation.html

GPU Gems 3 Call for Participation

Following the success of GPU Gems and GPU Gems 2, NVIDIA has decided to produce a third GPU Gems volume to showcase the best new ideas and techniques for the latest programmable GPUs. We were honored that GPU Gems won the 2004 Game Developer Front Line Award and that GPU Gems 2 was a Finalist in the 2005 Game Developer Front Line Awards. What’s more, GPU Gems and GPU Gems 2 were the best-selling books at the Game Developer Conference and SIGGRAPH in their respective years.

This latest GPU Gems will, like previous volumes, be hardbound and in full color. Tentatively titled GPU Gems 3, it will be edited by Hubert Nguyen, Manager of Developer Education at NVIDIA. Nguyen contributed to previous GPU Gems volumes and brings to this role vast experience in the field of computer graphics. Section editors include a team of expert NVIDIA engineers: Cyril Zeller, Evan Hart, Ignacio Castaño, Kevin Bjorke, Kevin Myers, and Nolan Goodnight.

NVIDIA is looking for innovative ideas from developers who are using GPUs in new ways to create stunning graphics and cutting-edge applications. GPU Gems 3 will present techniques and ideas that are broadly useful to GPU programmers and that can be integrated into their applications. And, it will continue the tradition of featuring chapters exploring non-graphics applications of the computational capabilities of GPU hardware (learn more at www.GPGPU.org). Because our goal is to provide a comprehensive set of authoritative and practical chapters, we strongly suggest submitting ideas about techniques that you have already developed and tested.

If you would like to contribute to the GPU Gems series, please read the following submission guidelines. The deadline for proposal submissions is Monday, December 11, 2006. If your proposal is accepted, you will receive additional time to complete the chapter.

Guidelines for Chapter Proposal

Each chapter proposal should meet the following qualifications:

• Subject. Your chapter can be about any topic related to applying GPUs in useful and compelling ways. For example, you may choose to write about a specific shader or technique for rendering an interesting effect, or you could write about a strategy for integrating shaders into a game engine. Or, you might discuss an interesting way to apply the GPU’s horsepower in a non-graphics area. The main requirement is that your subject has practical value for the community and that you are committed to writing a clear, concise, and informative chapter.

• Submission. Send an e-mail to articlesubmissions@nvidia.com with your proposed chapter title as the subject line, and a concise chapter description in the e-mail body (preferably no more than 300 words). To increase your chances of acceptance, we recommend that the description include screenshots or movies that demonstrate the technique in action. Ultimately, you must be able to provide a working program that demonstrates your technique. Complete source code is not necessarily required, though a self-contained example will be a plus.

• Deadline. We will be working on an aggressive schedule, so you must submit your proposal by Monday, December 11, 2006.

Notifications will be sent out by the end of the year. If your proposal is accepted, we will contact you via e-mail and discuss our expectations for the full chapter, as well as the next steps in the process. To assist you in finalizing your chapter, we will create your figures and provide copyediting services.

Final Chapter Information

• Length. The final chapters should range from five to twenty pages of formatted book pages. This requirement accounts for figures, code samples, and page layout, so there would be approximately 200 to 300 words per page. In some cases, we may accept chapters that are shorter or longer than the suggested length, depending on the content. A chapter does not have to be long or complicated to be accepted. In fact, an idea that is simple and compelling is more likely to be accepted.

• Rights. You must have the right to publish your work, code and images (diagrams and screenshots).

We look forward to reading your submissions.

Thinking of the Chinese book market, I can only sigh. When will books of this quality appear here? Things really are good abroad.

posted @ 2006-10-28 11:41 周波

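What Else Can a GPU Do: Brook for GPUs, Stream Computing on GPUs

I have been studying GPGPU for a while now; this time last year I was learning GLSL. Some time ago I posted a suggestion on opengl.org that GLSL should borrow its structure from Cg and CgFX instead of being used as scattered vertex/fragment pairs. You can wrap them in your own classes, but once the number of shaders grows, managing them becomes a real headache. GLSL should follow HLSL and Cg and render by selecting a technique and its passes, which also matches the idea of multi-pass rendering.

The GPU's SIMD performance is far beyond the CPU's, which gives it exceptional floating-point throughput; see the chart below.

    Aside: I wonder where my 6200A would rank, haha.

    The chart is actually somewhat out of date. It is taken from SIGGRAPH 2004, and the SIMD performance (not the gaming performance) of today's ATi 1800XT is already well beyond the 6800. Still, it is indisputable that the GPU's floating-point performance is several times the CPU's. The question is how to use it.

    Programmable hardware has given us a good start, and the future of computer hardware may well be general computing (GC, my own coinage, not garbage collection). Graphics cards have always dealt with pixels: reading texels, processing primitives, writing to the framebuffer, which laid a solid foundation for SIMD. Graphics chips were parallel designs from the start, since that is the only way reading texels from the texture units pays off. The famous Riva TNT2 takes its name from TwiN Texel, meaning dual texturing, not from the explosive. The GeForce3 implemented vertex programming by adding a few expensive registers. After acquiring 3dfx, NVIDIA brought shaders to the PC alongside DX8 and later shipped the NV30 series, starting the leap in PC image quality. Most popular games now use programmable shading for effects that once required workstations, which is why real-time game graphics today look better than the FF8 cutscenes Square rendered on a cluster of SGI workstations. Advanced CG theory was in fact already fairly mature by the 1980s: shadow mapping dates from 1978, Whitted's ray tracing from around the same time, and so on. I will introduce those techniques over time; meanwhile, download an SDK from NVIDIA to study, and the MS DX SDK is a must as well.

    First, the limitation of using today's programmable hardware for general computing, one that in my view may remain unsolved even after Vista and DX10 become mainstream: the API. Every driver the vendors ship is built purely for display, not for advertising the card as a GPGPU. They do have native compilers now (mainly for compiling GLSL source strings; HLSL can be precompiled and then loaded by the driver), but these still do not serve non-graphics computation. That is how I found Sh. Sh is an interesting thing: it uses metaprogramming to mimic a shading language inside C++, and the code is turned into the corresponding low-level assembly; many graphics slides use Sh to present their core algorithms. Those interested can follow the libSh link at the end of this post. I strongly suggest that vendors release drivers that can compute directly, without involving the framebuffer, and write straight to memory over the bus. Technically this is not hard; it is probably a business question. At critical moments it is always business that steers technology, not the wishful thinking of engineers; this is no longer the age of the Industrial Revolution.

    Let me introduce Brook from Stanford University (this sounds like an ad, but Stanford does have a place in the shading-language world with the Stanford Shading Language). Brook can be thought of as a C compiler, except that its output is not a binary but C++ source containing arrays of shader-program strings. Take this piece of Brook code, a simple alpha blend (well, not quite, but it will do):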

kernel void saxpy(float alpha, float4 x<>, float4 y<>,
out float4 result<>) {
result = (alpha * x) + y;
}

Compiled, the final C++ code becomes:

static const char* __saxpy_fp30[] = {
"!!FP1.0\n"
"DECLARE alpha;\n"
"TEX R0, f[TEX0].xyxx, TEX0, RECT;\n"
"TEX R1, f[TEX1].xyxx, TEX1, RECT;\n"
"MADR o[COLR], alpha.x, R0, R1;\n"
"END \n"
"##!!BRCC\n"
"##narg:4\n"
"##c:1:alpha\n"
"##s:4:x\n"
"##s:4:y\n"
"##o:4:result\n"
"##workspace:1024\n"
"##!!multipleOutputInfo:0:1:\n"
"",NULL};
void saxpy (const float alpha,const ::brook::stream& x,const ::brook::stream& y,
::brook::stream& result) {
    static const void *__saxpy_fp[] = {"fp30", __saxpy_fp30, "ps20", __saxpy_ps20,
                    "cpu", (void *) __saxpy_cpu, NULL, NULL };
    static __BRTKernel k(__saxpy_fp);
    k->PushConstant(alpha);
    k->PushStream(x);
    k->PushStream(y);
    k->PushOutput(result);
    k->Map();
}

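    Isn't this just a shading language? Note, though, that Brook wraps everything in a runtime library and treats the GPU as a streaming processor that the CPU controls, feeding it data and collecting the output. For now it seems limited to graphics-style computations, with demos such as FFT and ray tracing; it is not yet at the point of computing pi.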
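For comparison, here is what the saxpy kernel computes, written as an ordinary CPU loop. This is a reference sketch, not Brook's generated "cpu" backend; `Float4` and `saxpyCpu` are names chosen for this illustration, with each stream element modelled as four floats.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Float4 = std::array<float, 4>;  // stands in for Brook's float4

// result[i] = alpha * x[i] + y[i], applied to every component of every element.
std::vector<Float4> saxpyCpu(float alpha,
                             const std::vector<Float4>& x,
                             const std::vector<Float4>& y) {
    assert(x.size() == y.size());
    std::vector<Float4> result(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        for (std::size_t c = 0; c < 4; ++c) {
            result[i][c] = alpha * x[i][c] + y[i][c];
        }
    }
    return result;
}
```

On the GPU the outer loop disappears: every stream element is processed by one invocation of the fragment program, which is where the parallel speed-up comes from.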

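    Some further thoughts. Precision has to be solved: FP16 has only just come into wide use, and FP32 still lacks hardware filtering. FP32 is merely IEEE 754 single precision, nowhere near double, so it may not suit work that needs high precision. Computing millions of digits of pi as I imagined is not feasible for now. Above all, shading languages have never offered address operations, so a shader cannot choose the pixel position it writes to and cannot address the framebuffer precisely. If that were solved, truly general computing would be possible, and the framebuffer would be just a temporary buffer.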
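The precision limit is easy to see on the CPU. An IEEE 754 single-precision float has a 24-bit significand, so it cannot represent every integer above 2^24 = 16,777,216. The helper below is a small sketch (the name `fitsInFloat` is made up here) that checks whether an integer survives a round trip through a float, assuming the platform's float is IEEE 754 single precision.

```cpp
#include <cstdint>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// True when the integer survives a round trip through a 32-bit float unchanged.
bool fitsInFloat(std::int64_t n) {
    float f = static_cast<float>(n);
    return static_cast<std::int64_t>(f) == n;
}
```

With this, `fitsInFloat(16777216)` holds but `fitsInFloat(16777217)` does not, which is one reason multi-million-digit arithmetic cannot simply be dropped into FP32 textures.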

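    Physics on SIMD hardware can be very powerful. Physics calculations are all about simultaneity, and the GPU computes in parallel, which plays to its strengths; no wonder NVIDIA is working with Havok. I remember being stunned by a physics engine written by a gentleman on cnblogs (博客园), and I suggest he look into this area. The stream concept will be fully realized in DX10; see the DX10 article I translated earlier, where the Geometry Shader is particularly interesting.

    I look forward to the next generation of APIs, a brand-new combination of software and hardware that could bring a real revolution to that old thing, the display adapter. Note that AMD has already acquired ATi while Intel is still weighing a 10-billion-dollar price for NV, so the next round of change may already have begun. Let's wait and see.

    The things mentioned above can be found here: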
    Brook http://sourceforge.net/projects/brook
    libSh http://sourceforge.net/projects/libsh

posted @ 2006-10-14 22:21 周波

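Enduring, Helpless, at University

    Years of hardship, an age of emptiness, a decadent life.

    That is what I have been saying most often lately.

    Sophomore year really is different from freshman year; so much has changed. I have lost touch with many old friends, on the fine-sounding grounds that they have new friends of their own. At the start of term I brought an outdated laptop to school, planning to practice coding in my spare time, but it turned into a game machine everyone fights over to play Red Alert 2. Yesterday, walking down the corridor, I saw four people from one dorm room, two inside and two squatting in the hallway with their laptops on a cardboard box, an Acer and an ASUS A8H, happily playing QQ Dou Dizhu over the network. I was speechless. How I wish my machine could run GL 2.0, but it is only a P3 933 with an 830M chipset that supports 1.3, while here people play card games on luxury hardware. That's how it is; it takes all sorts. My family always says I am stingy and hard on myself, but I think it is better to spend carefully. No matter what, I will solder my MX350 back together rather than buy the Beyerdynamic I have long coveted.

    Cha-cha is really fun. For the first time I found so-called confidence through exercise; the class is full of pretty girls, and the teacher has an amazing figure. Sadly I have always been a pessimist, fond of words like "even if" and "it's just that". Nobody ever seems to ask me about computers, not even by phone; my handset truly lives up to the PHILIPS reputation as the standby king. I watch myself grow older day by day without much progress. After graduation, will I really end up helping Germans sell woodworking tools? Or go into the timber business myself? Then I might as well learn Russian now, raise money, and go develop the forests of the Russian Far East; making a few million a year would be no problem. But is that path realistic? The odds are too small and the challenge too great: easy to say, hard to do. Blizzard is hiring every day and Epic has opened a Shanghai studio, but they are out of my reach even if I were ready. For now I am reasonably proficient in C++, understand Win32 development, and am good at design and coding but not well suited to algorithm research. I have tried a great many things, from J2EE to NetGrid, COM to ICE, Maya to Photoshop, can pick any of them up quickly, and am confident I could master them before long. I hope to find somewhere to practice; Nanjing would do.

    Our day goes like this: eat, attend class, sleep, go online, repeat. Finally I could not stand the tedium of the physics textbook any longer and shouted, damn it, I'd rather go back to the 19th century! I pulled up the quilt and went to sleep, not forgetting to think about design patterns. Never mind what comes next; the life I want is simple: doing what I like. If I could, I would rather be a journalist or a lawyer and not bother with these boring problems, which are simply a waste of my life. Rock star would do too: form a dark folk band and rival the Europeans. Sadly it is all nonsense. Step outside and there are docked wages and high prices, towers and shacks, poverty and discrimination, and my own hard fate.

    A girl texted me, "Don't you find it boring?" I replied with three full stops, speechless, and went to sleep. I am slightly mentally ill, I am sure of it. In class I can lecture to myself as if nobody were around, holding forth on American history, the Napoleonic Code, the growth of China's middle class, and the women's liberation movement, yet when I actually go on stage I stammer and cannot say a thing. People say Welch was like this too, but I have neither his family circumstances nor his upbringing. Most people are the same. I wanted a relationship, and for a while I could not go three sentences without mentioning women, but in the end I found I was an idiot. Only once my reputation as a "love maniac" had spread through the college's class of '05 did I realize that the feeling I wanted had not arrived, not remotely, though I do still miss a few girls. Admiring quietly from a distance feels fine too. My friends say my standards keep dropping, which may have to do with my growing gloom. Without a woman my wallet gets fuller and my time more plentiful, but my view of love grows more skewed and my confidence lower. You win some, you lose some.

    If I were to advise the students after me, I would say: seize the time and squeeze gaming down as far as you can. Don't imagine you can learn anything by working in fits and starts; sometimes the learning curve should look like a derivative curve. Do fall in love, and the earlier the better, so that you don't waste time on it at the critical moments later. That is keeping in sync with the world, haha.

    Just venting a little; the beautiful and the ugly are both in here.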

posted @ 2006-10-14 17:13 周波

WoW Server Analysis (Part 1)


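Recently I took some time to study the WoW server architecture, and along the way I used those projects to review how MaNGOS uses a template-based Singleton. One thing puzzles me: with usage like Singleton<Master>, if the type passed in differs, could the static that comes back really be the same object? Surely not; what if I printed the this pointer to check? I will try when I have time. Singleton is a very important design pattern in game design, so do study it well.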
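The question can be settled with a small experiment. Each instantiation of a class template is a separate class, so `Singleton<Master>` and `Singleton<World>` each own their own static object. The sketch below is a minimal CRTP-style singleton written for this illustration; it is not the MaNGOS implementation, and `Master` and `World` are stand-in types.

```cpp
// Minimal template singleton: one static object per distinct T.
template <typename T>
class Singleton {
public:
    static T& instance() {
        static T object;  // created on first use, separately for each T
        return object;
    }
    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;

protected:
    Singleton() = default;
};

// Stand-in types for the illustration.
struct Master : Singleton<Master> { int id = 1; };
struct World : Singleton<World> { int id = 2; };
```

Calling `Master::instance()` twice returns the same address, while `Master::instance()` and `World::instance()` are two different objects, which is what printing the this pointer would show.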

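Authentication Process

The WoW server consists of two parts: the Logon Server (LS below) and the Realm Server (RS below). The LS accepts connections from the WoW client and works through the following steps:

Check the client's version, region and similar information, and check the account and password

Start or resume sending a patch (if there is one)

Run the SRP6 encrypted session with the client and write the generated key to the database

Send the realm list at the client's request

Once the client has chosen a realm, it disconnects from the LS and connects to the RS:

Authenticate, using the client key generated a moment ago

If that succeeds, enter the game-loop exchange

The RS and LS share the same database: after the LS generates the SRP6 key and writes it to the DB, the RS reads it back for the next authentication step.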

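Logon Server in Detail

The basic connection sequence is as follows:

The client prepares to connect and sends a CMD_AUTH_LOGON_CHALLENGE packet containing the data needed for login, such as the account name

The server returns a CMD_AUTH_LOGON_CHALLENGE packet whose fields include the validity check and the server's computed SRP6 data

If it is valid, the client sends a CMD_AUTH_LOGON_PROOF packet filled with the SRP6 data it computed itself

The server verifies it and sends back CMD_AUTH_LOGON_PROOF containing the result of the SRP6 verification

If all is well, the client sends a CMD_REALM_LIST packet requesting the available realms

The server replies with a CMD_REALM_LIST packet filled with the realm data the client needs

The client refreshes its realm list from the server every 3-4 seconds.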
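The packet sequence above amounts to a small state machine on the server side. The sketch below models only the ordering rules; the enum and function names are invented for this illustration, and the `valid` flag stands in for the real checks (account lookup, SRP6 proof), which are not shown.

```cpp
// States of one logon-server connection (names are illustrative).
enum class LogonState { AwaitChallenge, AwaitProof, Authenticated, Rejected };

// The three client packets described in the text.
enum class LogonPacket { AuthLogonChallenge, AuthLogonProof, RealmList };

// Advance the connection state; `valid` is the outcome of the packet's checks.
LogonState advance(LogonState state, LogonPacket packet, bool valid) {
    switch (state) {
    case LogonState::AwaitChallenge:
        if (packet == LogonPacket::AuthLogonChallenge)
            return valid ? LogonState::AwaitProof : LogonState::Rejected;
        break;
    case LogonState::AwaitProof:
        if (packet == LogonPacket::AuthLogonProof)
            return valid ? LogonState::Authenticated : LogonState::Rejected;
        break;
    case LogonState::Authenticated:
        // The realm list may be requested repeatedly (the client polls it).
        if (packet == LogonPacket::RealmList)
            return LogonState::Authenticated;
        break;
    case LogonState::Rejected:
        break;
    }
    return LogonState::Rejected;  // any out-of-order packet ends the session
}
```

Rejecting out-of-order packets is a design choice of this sketch; it keeps a client from requesting the realm list before its proof has been accepted.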

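So what kind of cryptographic scheme is SRP6? I had never used it before; what I had mostly seen were hash algorithms such as MD5 and SHA. SRP improves on the strengths of EKE-style algorithms and is well suited to network authentication; if I remember correctly, J2EE includes an implementation. Below is a brief introduction to how SRP6a works; see the original description for details.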

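N: N = 2q + 1, where q is prime; all modular arithmetic below is done modulo N
g: a generator modulo N
k: multiplier parameter; k = H(N, g) in SRP6a, k = 3 in SRP6
s: the user's salt
I: username
p: cleartext password
H(): one-way hash function
^: (modular) exponentiation
u: random scrambling parameter
a, b: secret ephemeral values
A, B: public ephemeral values
x: private key (derived from p and s)
v: password verifier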

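Here x = H(s, p) and v = g^x; s is chosen at random, and v is used later to verify the password.

The host stores { I, s, v } in the database. Authentication proceeds as follows:

The client sends the host I and A = g^a (a is a random number)

The host sends the client s and B = kv + g^b (the salt is sent; b is a random number)

Both sides compute u = H(A, B)

The client computes x = H(s, p) (hashing the password), S = (B - kg^x) ^ (a + ux), and K = H(S) (the session key)

The host computes S = (Av^u)^b and K = H(S), producing the same session key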

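To complete authentication, the two sides prove to each other that their keys match:

The client sends M = H(H(N) xor H(g), H(I), s, A, B, K), and the host checks it against the value it computes from its own stored data.

Likewise, the host replies with H(A, M, K), which the client verifies. This completes the authentication.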
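The algebra can be checked with toy numbers. The sketch below follows the formulas above and confirms that both sides arrive at the same S. Everything concrete in it is an assumption made for illustration: N = 1019 (a tiny safe prime, since 509 is prime), g = 2, k = 3, and `toyHash`, which is a stand-in for H() and has no cryptographic value. Real SRP6 uses a large N and a real hash such as SHA-1.

```cpp
#include <cstdint>

using u64 = std::uint64_t;

// Modular exponentiation by squaring; overflow-free while mod < 2^32.
u64 powMod(u64 base, u64 exp, u64 mod) {
    u64 result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Toy stand-in for H(): NOT cryptographic, output limited to 16 bits.
u64 toyHash(u64 a, u64 b) {
    u64 h = 14695981039346656037ULL;
    h = (h ^ a) * 1099511628211ULL;
    h = (h ^ b) * 1099511628211ULL;
    return (h >> 24) & 0xFFFF;
}

struct SrpResult {
    u64 clientS;
    u64 serverS;
};

// Run one exchange with toy parameters and return both sides' value of S.
SrpResult runToySrp(u64 salt, u64 password, u64 a, u64 b) {
    const u64 N = 1019, g = 2, k = 3;        // assumed toy parameters
    u64 x = toyHash(salt, password);         // x = H(s, p)
    u64 v = powMod(g, x, N);                 // verifier stored by the host
    u64 A = powMod(g, a, N);                 // client -> host
    u64 B = (k * v + powMod(g, b, N)) % N;   // host -> client
    u64 u = toyHash(A, B);                   // u = H(A, B)
    u64 base = (B + N - (k * v) % N) % N;    // B - k*g^x, kept non-negative
    u64 clientS = powMod(base, a + u * x, N);
    u64 serverS = powMod(A * powMod(v, u, N) % N, b, N);
    return {clientS, serverS};
}
```

Both values equal g^(b(a + ux)) mod N, which is why the client and host can derive the same K = H(S) without the password ever crossing the wire.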

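Realm Server in Detail

After disconnecting from the LS, the client starts authenticating with the RS:

Connect to the RS and send the server an SMSG_AUTH_CHALLENGE packet containing the random seed used last time

The server sends back SMSG_AUTH_CHALLENGE. The client derives a random seed from the seed the server returned and the SRP6 data, generates a SHA1 digest, builds a CMSG_AUTH_SESSION packet from these, and sends it to the server.

Note that this exchange is not encrypted. When the server receives the authentication reply, it also generates a SHA1 digest from the client's seed and compares it with the one from the client; if they match, everything is OK.

Now let's look at operations such as creating characters on an account. An account can hold up to 50 characters, I think; I have not played the game and have only read the manual.

The client sends a CMSG_CHAR_ENUM packet to request its characters

The server sends back an SMSG_CHAR_ENUM packet containing all character information

The client can now operate on these characters: CMSG_CHAR_CREATE, CMSG_CHAR_DELETE, CMSG_CHAR_PLAYER_LOGIN

Once the character has logged in, the server sends back an SMSG_CHAR_DATA packet

How do things work in the game loop?

If the player quits the game immediately, the client sends CMSG_PLAYER_LOGOUT and the server replies with SMSG_LOGOUT_COMPLETE

If the player chooses to log out after a delay, the client sends CMSG_LOGOUT_REQUEST and the server replies with SMSG_LOGOUT_RESPONSE. If the player quits during the countdown, the client sends CMSG_PLAYER_LOGOUT, but the character still stays in the world until the countdown finishes.

If the player cancels the logout and keeps playing, the client sends CMSG_LOGOUT_CANCEL and the server replies with SMSG_LOGOUT_CANCEL_ACK.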
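The three logout cases can be written down as a transition function. This is a sketch of the rules as described above, not server code: the type and function names are invented here, the enumerators only mirror the packet names from the text, and the countdown timer itself is left out.

```cpp
// Logout-related state of one character session (illustrative names).
enum class LogoutState { InGame, CountingDown, LoggedOut };

// Client packets: CMSG_PLAYER_LOGOUT, CMSG_LOGOUT_REQUEST, CMSG_LOGOUT_CANCEL.
enum class ClientMsg { PlayerLogout, LogoutRequest, LogoutCancel };

// Server replies: SMSG_LOGOUT_COMPLETE, SMSG_LOGOUT_RESPONSE, SMSG_LOGOUT_CANCEL_ACK.
enum class ServerReply { None, LogoutComplete, LogoutResponse, LogoutCancelAck };

struct LogoutStep {
    LogoutState next;
    ServerReply reply;
};

LogoutStep handleLogout(LogoutState state, ClientMsg msg) {
    if (state == LogoutState::InGame) {
        if (msg == ClientMsg::PlayerLogout)
            return {LogoutState::LoggedOut, ServerReply::LogoutComplete};
        if (msg == ClientMsg::LogoutRequest)
            return {LogoutState::CountingDown, ServerReply::LogoutResponse};
    } else if (state == LogoutState::CountingDown) {
        if (msg == ClientMsg::LogoutCancel)
            return {LogoutState::InGame, ServerReply::LogoutCancelAck};
        // Quitting mid-countdown: the character stays until the timer expires.
        if (msg == ClientMsg::PlayerLogout)
            return {LogoutState::CountingDown, ServerReply::None};
    }
    return {state, ServerReply::None};  // anything else is ignored
}
```

A full server would add a timer event that moves CountingDown to LoggedOut; it is omitted to keep the sketch focused on the packets named in the text.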

posted @ 2006-10-14 16:27 周波

World Of Warcraft Server Source Topic

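Disclaimer: the source code of World of Warcraft and related programs is the property of Blizzard. WowWow is only an emulator of the WoW server side, obtained by Russian hackers through reverse engineering. It is offered here solely for learning about online game servers and for discussion, and contains no source code from Blizzard or its Chinese operator The9. I bear no responsibility for any individual or organization that uses this source code to run a business that may break the law.

I can't stand these lousy Chinese sites that make you pay for some lousy VIP account just to download source code, unaware that sf.net is full of good code.

I dragged this back from an overseas forum. My hard disk can't fit the WoW client, so I have not tested it; those who can, give it a try.

I plan to spend some time rewriting it in C++, even though the similar Mangos already exists; I really dislike virtual-machine languages such as C# and Java. .NET people, please don't jump out to argue that C# isn't a virtual-machine language; I can't be bothered. The compiled code is small, but startup is painfully slow and it needs the .NET Framework on top, which is a hassle.

The earliest was WowEmu, which is what many "single-player" WoW bundles ship with, so I won't list an address; it is all over BT.

Then came Wowwow, but its core code is not public; you will see talk of decompilers and so on.
Download link
Here is a Wowwow Alpha v8.3 that comes with some code:
Download link

What I am analyzing at the moment is Mangos. Its home turns out to be on sf.net, where the description never mentions World of Warcraft, though that is exactly what it runs.
Go here

Discussion is welcome; if you find this useful, please leave me a reply. Thanks~~~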

posted @ 2006-10-05 13:59 周波

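To You Whom I Once Loved Deeply, Far Away: I Wish You Happiness

    Please allow me to call you "dear" one more time, because I fear that once I turn away I will forget you for good.
    We have not been in touch since the last time, when I sent a message by mistake and let you hear the negative side of my true feelings. I know you must have thought that I had been deceiving you all these years, and you are someone who will not be fooled. Since then we have had no contact, and you never replied even once. I don't want to say more; you understand your own choice, and I am only an ordinary person. I no longer feel anything for you, a decision made out of sheer helplessness. With someone else, perhaps I will find new hope. I hope you live very, very well. This blessing comes from someone who loved you.

        Moonstone

    In the season of melting ice and snow,
    grass seeds wake and cannot stop crying.

    The choice between love and silence
    was never shouldered together.

    The aging grieve as before,
    cold takes root in autumn's desolation.

    A life spent sorrowing only for others,
    the song ends and the blossoms fall in silence.

    For whom do your tears fall tonight,
    with only the bright moon to carry longing a thousand miles.

    To all the lonely people in the world: may your families be happy, and may all lovers be united in the end.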

posted @ 2006-10-02 13:54 周波

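A Glimpse of Elite Education in America

    Apologies for the recent silence. The start of term has been too busy, my articles all sit on the laptop, getting online at school is very inconvenient, and converting from the PocketPC is a pain, so today I am posting everything in one go. I am also going away for the National Day holiday, and of course it is raining. Annoying.

    Chinese parents are forever admiring American-style family education: push the child out the door at 18 and stop worrying, and with luck a few decades later out comes someone like Carnegie or Gates. In short, that is the kind of son worth having. Or else study hard from childhood, go to university, get through a postdoc, and finally go abroad; one need not be fabulously rich, just middle class with a house and a car.

    But is that how it really is? In high school I envied America's SAT system: you can sit it several times a year and keep your best score. It sounds like plenty of chances and open competition, but it is not. America is full of top performers too, and high-school students good enough to embarrass China's exam "champions" (状元) are everywhere. Among the students who apply each year to the Ivy League and other world-class universities such as Harvard, Yale and Princeton, 30% have frighteningly high SAT scores; out of a maximum of 1600, some reach 1560. On top of that, applicants need a flawless interview and plenty of material demonstrating personal ability and experience. What they prepare for university goes beyond what we imagine. It is not just our three years of high-school slog, memorizing exam questions and finally fighting across the single-plank bridge past thousands of rivals into university. Rather, they gain their edge from the environment they are raised in, because today the only way for the middle class to truly enter the upper class is through education: spending heavily from childhood on piano, guitar, dance and so on, shaping in mind and character an all-rounder fit to compete with first-rate rivals. Most applicants to these universities come from well-off upper-middle-class families. Unlike Europe's hereditary "aristocracy", in America the idea that a hero's son is a hero is realized through education. The Bushes, father and son, are an example even bin Laden knows: once the father has made it, the son can attend the best university, receive the best education, and get to know society's top families. As said above, this is achieved through family influence from childhood and high-intensity training beyond what the age can normally bear. On this point Chinese and Western cultures agree remarkably: there is no success without hard work.

    Think of it this way. A Chinese student says: I have read so many C++ books, written so many programs, won such-and-such awards, and that is all. An American student says: I interned at AT&T over the summer, talked with many world-class experts, and then took the money I earned to Peru for volunteer work, teaching computer classes at a school for free and being commended locally. If you were an HR manager at Microsoft, which student would you pick? The gap is easy to imagine. Put another way, we are training ordinary workers for the world's factory, while their classrooms instill how to lead the world in ideas and how to acquire exceptional ability and creativity. Even so, this way of cultivating elites has a clear adverse effect on the mindset of the broad lower and middle classes. America's ordinary state universities have no lack of first-rate facilities and accomplished professors, but most of their students come from lower-class families and care more about how to earn money, like a McDonald's cashier; whether by robbing a bank or honestly repairing cars, they do not think much about. I suspect most Chinese university students, me included, think the same way at this stage: envying classmates' part-time pocket money, buying what we like, and hoping for a good job after graduation. The prospects of such students are easy to imagine. A few may eventually break through into the upper levels of society, but the vast majority will go on living at the bottom. Students born into upper-middle-class families hold an educational advantage from birth, able to afford all kinds of tutoring, classes, exchange trips abroad and so on.

    There is another interesting question. In our impression, America's top universities such as Harvard and Yale excel in fields like economics and law, and are less outstanding in basic disciplines such as mathematics and physics. One reason is that schools differ in character: the old technical institutes such as Caltech and MIT have always been strong there. The other important reason is whom the school aims to produce: it trains elites, not pure technicians. An interesting argument comes from The Rise and Decline of Nations (《国家的兴衰探源》): for an ordinary citizen, studying social problems is a waste of time of no value, because he lacks sources of specialist knowledge and other individuals to exchange views with; in other words, such study brings him no personal benefit. Put differently, the university is a body that stands apart from society and above it, whose duty is to train the people who will lead that society. Our own Peking University is likewise famous for the humanities, while the specialist science schools draw less attention. Is this the old idea that "those who labor with their minds govern others" (劳心者治人)? Perhaps partly, but what economics, law and the like teach is a way of thinking and an attitude, and that is what matters most. Seen this way, it is no accident that Gates's parents were not engineers; they worked in law, which helped make him good at negotiation and social dealings and at thinking problems through rationally. It is like parents who send their children to learn piano rather than computing: what needs to accumulate is bearing and character, not a pedant who knows only technology and cannot see the bigger picture.

    Here I am, in university and already feeling the approach of thirty, having been talked into wasting a good part of my youth. I dream of measuring myself against these first-rate white students, but I have neither the chance nor the strength. Humans have been evolving for tens of thousands of years and intelligence has largely evened out (leaving aside what is innate or due to medical care), so one's real strength and the position one can rise to are broadly settled. Perhaps what we need is to think, for our descendants' sake, about how they should grow up and live, and about what we ourselves truly need. Our competitors are the descendants of the Protestant settlers across the Pacific, not merely the neighbor's bookish son who got into Nanjing University.

    For me right now, technology may be important and love may be urgent, but I am beginning to feel that paying attention to the real world is what most urgently needs our careful thought. What do you think, young guys?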

posted @ 2006-09-30 13:52 周波


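周波, born in 1987. Nanjing Forestry University, Class 05421, P.O. Box 242. Major: Wood Science and Engineering (Industrial Equipment and Process Automation). Blog moved to jedimaster(dot)cnblogs(dot)com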
