When Brook Meets ICE
A Small Talk about a General Computing Platform
Bosch Chou (zhoubo22@hotmail.com)
As we have seen, distributed-communication techniques such as CORBA, DCOM, and even Java have been used widely around the world. All of them can serve purposes such as RPC, distributed computing, and other applications for business and science.
Now let's have a look at the development of PC hardware. CPUs are becoming much faster, and much cheaper, than ever before. At the same time there is the GPU, or more generally the card we call the display adapter. Since NVIDIA released the new GeForce series of graphics cards in 1999, performance has kept being pushed forward, and next year we will be able to buy DX10 cards on the market. A graphics card can do vertex transform and lighting instead of the CPU. That is great progress for both CPU and GPU. How can we use these rich SIMD resources? It is easy to understand why we will focus on the GPU.
Calm down; what is our desired platform?
- Cross operating systems
- Cross networks
- Cross hardware – this is the key problem I am trying to solve.
Most of the points I listed here, except the last one, have already been solved by existing techniques. So, how do we proceed? I found two treasures. ICE, the Internet Communications Engine, is quite similar to classic CORBA but much easier to use. Brook, from Stanford University, has been developed for years and is designed for GPU stream computing. Both of them work the same way: a front-end compiler translates source code into C++. We can then add the generated .h and .cpp files to our projects and code against the interface.
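To make this concrete, here is roughly what each front-end compiler consumes. This is only a sketch based on the public ICE and Brook documentation; the names Demo, FloatSeq, Calculator and add are my own placeholders, not part of any shipped sample.

// Slice, ICE's IDL: slice2cpp turns this into C++ .h/.cpp files
module Demo {
    sequence<float> FloatSeq;
    interface Calculator {
        FloatSeq add(FloatSeq a, FloatSeq b);
    };
};

// Brook source (.br): brcc turns this into C++ that drives the GPU back end
kernel void add(float a<>, float b<>, out float c<>) {
    c = a + b;
}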
The process by which a client passes its call to the server is shown below.
- The client passes the data to be computed to an interface declared on both sides.
- The server receives the data, computes on it, and passes the results back to the client.
- The client receives the results and continues its own work.
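In ICE's classic C++ mapping, the client side of this exchange might look roughly like the sketch below. It reuses the hypothetical Calculator interface from the earlier Slice sketch; the endpoint string and error handling are simplified, but the initialize / stringToProxy / checkedCast pattern is the standard one from the ZeroC manual.

// Client-side sketch (ICE classic C++ mapping; names are hypothetical)
#include <Ice/Ice.h>
#include <Calculator.h>   // generated by slice2cpp

int main(int argc, char* argv[])
{
    Ice::CommunicatorPtr ic = Ice::initialize(argc, argv);
    Ice::ObjectPrx base = ic->stringToProxy("Calculator:default -p 10000");
    Demo::CalculatorPrx calc = Demo::CalculatorPrx::checkedCast(base);

    Demo::FloatSeq a(1024, 1.0f), b(1024, 2.0f);
    Demo::FloatSeq c = calc->add(a, b);   // data crosses the wire, the server computes, the result comes back

    ic->destroy();
    return 0;
}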
But the problem is that there are two kinds of IDL: one is for Internet applications, the other is for local GPU stream computing. Moreover, ICE has no stream data type. It sounds a bit like C++ metaprogramming, but in fact the two are quite different. So, does that mean we must define a new IDL? Let's check the tools we already have.
In fact, the most important thing is the base model. ICE supports a type called "sequence", which is mapped to an STL container in C++. It can be considered the base data type of the new language we want to invent. When a client sends a request and the server accepts it, the client sends the data wrapped in this container; the server then rebuilds it in its own memory as a texture structure. Once the server has prepared all the textures containing the data, it calls the graphics API and uses shaders to compute on them. The whole process is illustrated below.
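First, the sequence mapping itself, since everything else leans on it. A minimal sketch (the FloatSeq name is my placeholder; the mapping rule is ICE's documented C++ mapping):

// Slice definition
sequence<float> FloatSeq;

// slice2cpp maps it to an ordinary STL container, effectively:
typedef std::vector<float> FloatSeq;

So the "stream" of the combined language can simply be a vector on the wire and a texture once it reaches the GPU.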
For example, we could write the following IDL:
GPU Interface Foo
{
    Add([in] float a<>, [in] float b<>, [out] float c<>)
    {
        /* some stuff */
    }
}
CPU Interface Bar
{
    Add([in] float a[], [in] float b[], [out] float c[])
    {
        /* some stuff */
    }
}
We declared two interfaces. Note that "GPU" and "CPU" are the keywords here; they mark what each interface is used for: one will run on the traditional CPU, the other on the GPU.
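What a translator for this combined IDL would emit is, of course, hypothetical. As a sketch, the GPU-marked Add could be lowered to a Brook kernel, while the CPU-marked one becomes an ordinary C++ function over STL vectors:

// Hypothetical output for the GPU-marked interface: a Brook kernel (brcc input)
kernel void Foo_Add(float a<>, float b<>, out float c<>) {
    c = a + b;
}

// Hypothetical output for the CPU-marked interface: plain C++
void Bar_Add(const std::vector<float>& a, const std::vector<float>& b, std::vector<float>& c) {
    c.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        c[i] = a[i] + b[i];
}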
// On the server side
// verify the validity of the data
vector<float> tex1;
vector<float> tex2;
vector<float> result;

Add(tex1, tex2, result)   // pass by reference to avoid stack copies
{
    GLfloat* Tex1Ptr = new GLfloat[tex1.size()];
    /* some stuff as above: convert the container into a texture structure */
    GLuint hTex1;
    glGenTextures(1, &hTex1);
    glTexImage2D(/**/ /**/, Tex1Ptr);   // upload the data into GPU memory as a texture
    glUseProgram(g_hArithmetic);
    /*
       Draw something to get all the data out, a rectangle etc.
    */
}
If you are familiar with GL programming, you will ask, "Why not add glFlush and SwapBuffers above?" In fact, that is the key point of this whole article. If we only needed 1 + 1, we would not need a GPU at all. People are greedy all the time. Suppose we want the GPU to compute π for us, to 16 million digits, but the GPU can only hold textures of up to 4096x4096 floats. When the GPU swaps buffers, we must move all the data from the framebuffer to disk, save it, and then let the GPU continue computing. But how? I checked the OpenGL and D3D manuals and found nothing useful. So I thought of several ways to attack this key problem:
- Next-generation hardware architecture: the CPU integrates the GPU. I think AMD & ATI will do this.
- Improve the current APIs and drivers to support operating on SIMD registers directly.
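For reference, the closest thing today's APIs offer is a full, blocking readback of the framebuffer, for example with glReadPixels, tile by tile. The sketch below assumes a single-channel float framebuffer; compute_tile_on_gpu, save_tile_to_disk and numTiles are hypothetical placeholders. It shows exactly why this path is unsatisfying: every readback stalls the pipeline instead of letting the GPU keep computing.

// Tile-by-tile readback sketch: pull one 4096x4096 float tile back to host
// memory, save it, then start the next tile.
const int TILE = 4096;
std::vector<GLfloat> host(TILE * TILE);

for (int tile = 0; tile < numTiles; ++tile) {
    compute_tile_on_gpu(tile);                        // hypothetical: runs the shader pass for this tile
    glReadPixels(0, 0, TILE, TILE,
                 GL_LUMINANCE, GL_FLOAT, &host[0]);   // blocks until the GPU finishes
    save_tile_to_disk(tile, host);                    // hypothetical helper
}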
Everything above is about a special aspect of distributed computing: how to make the GPU compute the way a CPU does. If this can be implemented one day, I think modern science will benefit greatly from it.
References:
ICE, Internet Communications Engine, ZeroC, Inc., http://www.zeroc.com/
Brook, Stanford University, http://sf.net/projects/brook
NVIDIA Developer Zone, http://developer.nvidia.com/
OpenGL official site, http://www.opengl.org/
posted on 2006-10-28 11:58