concentrate on c/c++ related technology

plan,refactor,daily-build, self-discipline,

only one vertex shader can be active at one time.
every vertex shader-driven program must run the following steps:
1) check for vertex shader support by checking the D3DCAPS8::VertexShaderVersion field.
D3DVS_VERSION(X,Y) encodes shader version X.Y.
if(pCaps->VertexShaderVersion < D3DVS_VERSION(1,1))
{
  return E_FAIL;
}
this check fails when the hardware does not support at least vertex shader version 1.1.
the supported vertex shader version is stored in the D3DCAPS8 structure.
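a minimal sketch of why the plain "<" comparison in the check works, assuming the version value is packed the way the D3D8 headers pack it (0xFFFE in the high word, major and minor in the two low bytes). the names vsVersion and supportsVertexShader are hypothetical stand-ins so the snippet builds without the DirectX SDK:

```cpp
#include <cstdint>

// Stand-in for D3DVS_VERSION(major, minor): 0xFFFE in the high word,
// major version in bits 8..15, minor version in bits 0..7.
constexpr std::uint32_t vsVersion(std::uint32_t major, std::uint32_t minor) {
    return 0xFFFE0000u | (major << 8) | minor;
}

// Hypothetical helper: true when the reported caps value is at least
// the requested version. A device without vertex shaders reports 0.
constexpr bool supportsVertexShader(std::uint32_t reported,
                                    std::uint32_t major,
                                    std::uint32_t minor) {
    return reported >= vsVersion(major, minor);
}

static_assert(vsVersion(1, 1) == 0xFFFE0101u, "version 1.1 packs as 0xFFFE0101");
```

because a higher major or minor number always produces a larger packed value, one integer comparison is enough to test for "at least version X.Y".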
2) declare the vertex shader with the D3DVSD_* macros, which map vertex buffer streams to input registers.
you must declare a vertex shader before using it.
SetStreamSource: binds a vertex buffer to a device data stream. D3DVSD_STREAM selects that stream in the declaration.
D3DVSD_REG: binds a single vertex register to a vertex element/property from the vertex stream.
3) set the vertex shader constant registers with SetVertexShaderConstant.
you fill the vertex shader constant registers with SetVertexShaderConstant, and read them back with GetVertexShaderConstant.
D3DVSD_CONST: used in the vertex shader declaration, and it can only be used once.
SetVertexShaderConstant: it can be called before every DrawPrimitive* call.
4) compile the previously written vertex shader with D3DXAssembleShader*.
the available instructions include:
add dest, src1, src2  dest = src1 + src2
dp3 dest, src1, src2  dest.x = dest.y = dest.z = dest.w = (src1.x * src2.x) + (src1.y * src2.y) + (src1.z * src2.z)
dp4 dest, src1, src2  scalar result = (src1.x * src2.x) + (src1.y * src2.y) + (src1.z * src2.z) + (src1.w * src2.w); it is written to the components selected by the write mask.
dst dest, src1, src2  dest.x = 1; dest.y = src1.y * src2.y; dest.z = src1.z; dest.w = src2.w; it is useful for calculating standard attenuation.
expp dest, src.w
 float tmp = (float)pow(2, src.w);
 DWORD tmpd = *(DWORD*)&tmp & 0xffffff00;
 dest.z = *(float*)&tmpd;
lit dest, src

Calculates lighting coefficients from two dot products and a power.
---------------------------------------------
To calculate the lighting coefficients, set up the registers as shown:

src.x = N*L ; The dot product between normal and direction to light
src.y = N*H ; The dot product between normal and half vector
src.z = ignored ; This value is ignored
src.w = specular power ; The value must be between -128.0 and 128.0
logp dest, src.w
 float v = ABSF(src.w);
 float tmp = (float)(log(v)/log(2));
 DWORD tmpd = *(DWORD*)&tmp & 0xffffff00;
 dest.z = *(float*)&tmpd;
mad dest src1 src2 src3 dest = (src1 * src2) + src3
max dest src1 src2 dest = (src1 >= src2)?src1:src2
min dest src1 src2 dest = (src1 < src2)?src1:src2
mov dest, src  copies src to dest
mul dest, src1, src2  sets dest to the component-by-component product of src1 and src2
nop  does nothing
rcp dest, src.w
if(src.w == 1.0f)
{
  dest.x = dest.y = dest.z = dest.w = 1.0f;
}
else if(src.w == 0)
{
  dest.x = dest.y = dest.z = dest.w = PLUS_INFINITY();
}
else
{
  dest.x = dest.y = dest.z = dest.w = 1.0f/src.w;
}
rsq dest, src

reciprocal square root of src
(much more useful than straight 'square root'):

float v = ABSF(src.w);
if(v == 1.0f)
{
  dest.x = dest.y = dest.z = dest.w = 1.0f;
}
else if(v == 0)
{
  dest.x = dest.y = dest.z = dest.w = PLUS_INFINITY();
}
else
{
  v = (float)(1.0f / sqrt(v));
  dest.x = dest.y = dest.z = dest.w = v;
}
sge dest, src1, src2 dest = (src1 >=src2) ? 1 : 0
slt dest, src1, src2 dest = (src1 <src2) ? 1 : 0

The Vertex Shader ALU is a multi-threaded vector processor that operates on quad-float data. It consists of two functional units. The SIMD Vector Unit is responsible for the mov, mul, add, mad, dp3, dp4, dst, min, max, slt and sge instructions. The Special Function Unit is responsible for the rcp, rsq, log, exp and lit instructions.

rsq is used to normalize vectors for use in lighting equations.
The exponential instruction expp can be used for fog effects and procedural noise generation.
The log function is the inverse of the exponential function, which means it undoes the operation of the exponential function.
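The normalization use of rsq can be sketched on the CPU. This is an emulation under assumptions, not shader code: Vec4, dp3, rsq, mul and normalize3 are hypothetical helpers that mirror the instruction semantics listed above, and the special cases of rsq (inputs 0 and 1) are left out for brevity.

```cpp
#include <cmath>

struct Vec4 { float x, y, z, w; };

// dp3: three-component dot product, replicated to every component.
Vec4 dp3(const Vec4& a, const Vec4& b) {
    float d = a.x * b.x + a.y * b.y + a.z * b.z;
    return {d, d, d, d};
}

// rsq: 1 / sqrt(|src.w|), replicated to every component.
Vec4 rsq(const Vec4& s) {
    float v = 1.0f / std::sqrt(std::fabs(s.w));
    return {v, v, v, v};
}

// mul: component-by-component product.
Vec4 mul(const Vec4& a, const Vec4& b) {
    return {a.x * b.x, a.y * b.y, a.z * b.z, a.w * b.w};
}

// Emulates the usual three-instruction sequence:
//   dp3 r0.w, v, v
//   rsq r0.w, r0.w
//   mul r1.xyz, v, r0.w
Vec4 normalize3(const Vec4& v) {
    Vec4 r0 = dp3(v, v);   // squared length
    r0 = rsq(r0);          // 1 / length
    Vec4 r1 = mul(v, r0);  // scale x, y, z
    r1.w = v.w;            // the .xyz write mask skips w; carry v.w through
    return r1;
}
```

For the vector (3, 0, 4) the squared length is 25, rsq yields 0.2, and the scaled result is (0.6, 0, 0.8).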

The lit instruction deals by default with directional lights. It calculates the diffuse & specular factors with clamping based on N * L and N * H and the specular power. There is no attenuation involved, but you can use an attenuation level separately with the result of lit by using the dst instruction. This is useful for constructing attenuation factors for point and spot lights.
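A CPU-side sketch of what lit computes, assuming the commonly documented semantics: x and w are set to 1, y holds the clamped diffuse factor, and z holds the specular factor only when both dot products are positive. Vec4 and the function name lit are stand-ins, and the clamp of the power to [-128, 128] simply follows the range stated in these notes.

```cpp
#include <algorithm>
#include <cmath>

struct Vec4 { float x, y, z, w; };

// src.x = N.L, src.y = N.H, src.z = ignored, src.w = specular power.
Vec4 lit(const Vec4& src) {
    // Assumption: clamp the power to the range given in the notes.
    float power = std::max(-128.0f, std::min(128.0f, src.w));

    Vec4 dest = {1.0f, 0.0f, 0.0f, 1.0f};
    if (src.x > 0.0f) {
        dest.y = src.x;                      // diffuse factor, clamped at 0
        if (src.y > 0.0f) {
            dest.z = std::pow(src.y, power); // specular factor
        }
    }
    return dest;
}
```

A surface facing away from the light (N.L <= 0) therefore gets neither a diffuse nor a specular contribution, which is the clamping the paragraph above refers to.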

The min and max instructions allow for clamping and absolute value computation.
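A small CPU-side sketch of those two uses, with hypothetical helpers (Vec4, vmax, vmin, neg, absV, clampV) that follow the max and min definitions given earlier. The absolute value relies on source negation: max r0, r1, -r1.

```cpp
struct Vec4 { float x, y, z, w; };

// max / min as defined in the instruction list, applied per component.
Vec4 vmax(const Vec4& a, const Vec4& b) {
    return {a.x >= b.x ? a.x : b.x, a.y >= b.y ? a.y : b.y,
            a.z >= b.z ? a.z : b.z, a.w >= b.w ? a.w : b.w};
}
Vec4 vmin(const Vec4& a, const Vec4& b) {
    return {a.x < b.x ? a.x : b.x, a.y < b.y ? a.y : b.y,
            a.z < b.z ? a.z : b.z, a.w < b.w ? a.w : b.w};
}

// Source negation modifier (-R).
Vec4 neg(const Vec4& a) { return {-a.x, -a.y, -a.z, -a.w}; }

// Absolute value:  max r0, r1, -r1
Vec4 absV(const Vec4& r1) { return vmax(r1, neg(r1)); }

// Clamp to [lo, hi]:  max r0, r1, cLo   then   min r0, r0, cHi
Vec4 clampV(const Vec4& r1, const Vec4& lo, const Vec4& hi) {
    return vmin(vmax(r1, lo), hi);
}
```

Both patterns cost one or two instructions and need no branching, which matters because the instruction set has no conditional jumps.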
Using the Input Registers

The 16 input registers can be accessed by using their names v0 to v15. Typical values provided to the input vertex registers are:

  • Position(x,y,z,w)
  • Diffuse color (r,g,b,a) -> 0.0 to +1.0
  • Specular color (r,g,b,a) -> 0.0 to +1.0
  • Up to 8 Texture coordinates (each as s, t, r, q or u, v , w, q) but normally 4 or 6, dependent on hardware support
  • Fog (f,*,*,*) -> value used in fog equation
  • Point size (p,*,*,*)

The input registers are read-only. Each instruction may access only one vertex input register. Unspecified components of the input registers default to 0.0 for .x, .y and .z, and to 1.0 for .w.
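The default-fill rule can be illustrated with a short sketch. Vec4 and expandInput are hypothetical names; the function only mimics how a vertex element with fewer than four floats ends up as a quad-float in an input register.

```cpp
struct Vec4 { float x, y, z, w; };

// Expands 1 to 4 floats into a quad-float, filling missing components
// with 0.0 for x, y, z and 1.0 for w.
Vec4 expandInput(const float* data, int count) {
    Vec4 v = {0.0f, 0.0f, 0.0f, 1.0f};
    if (count > 0) v.x = data[0];
    if (count > 1) v.y = data[1];
    if (count > 2) v.z = data[2];
    if (count > 3) v.w = data[3];
    return v;
}
```

A three-float position (x, y, z) therefore arrives as (x, y, z, 1.0), which is exactly the homogeneous form a 4x4 transform expects.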

All data in an input register remains persistent throughout the vertex shader execution and beyond it. The registers retain their data longer than the lifetime of a vertex shader, so it is possible to reuse the data of the input registers in the next vertex shader.

Using the Constant Registers

Typical uses for the constant registers include:

  • Matrix data: quad-floats are typically one row of a 4x4 matrix
  • Light characteristics, (position, attenuation etc)
  • Current time
  • Vertex interpolation data
  • Procedural data

the constant registers are read-only from the perspective of the vertex shader, whereas the application can both read and write them. like the input registers, they keep their contents and can be reused by the next vertex shader.
this allows an application to avoid making redundant SetVertexShaderConstant() calls.
Using the Address Register
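you access the address register as a0. vertex shader version 1.1 provides a single address register, and only its .x component (a0.x) can be used; more than one address register (a0 to an) should be available in vertex shader versions higher than 1.1.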
Using the Temporary Registers
you can access 12 temporary registers using r0 to r11.
each temporary register has single write and triple read access, so an instruction can use the same temporary register as a source up to three times. vertex shaders cannot read a value from a temporary register before writing to it: if you try to read a temporary register that was not filled with a value, the API will give you an error message while creating the vertex shader (CreateVertexShader).
Using the Output Registers
there are up to 13 write-only output registers: oPos, oD0 and oD1, oT0 to oT7, oFog and oPts. they are defined as the inputs to the rasterizer, and the name of each register is preceded by a lower case 'o'. the output registers are named to suggest their use by pixel shaders.
every vertex shader must write to at least one component of oPos, or you will get an error message from the assembler.
swizzling and masking
if you use the input, constant and temporary registers as source registers, you can swizzle the .x, .y, .z and .w values independently of each other.
if you use the output and temporary registers as destination registers, you can use the .x, .y, .z and .w values as write masks.
component modifier      description
R.[x][y][z][w]          destination mask
R.xwzy                  source swizzle
-R                      source negation
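a CPU-side sketch of the three modifiers, emulating an instruction such as mov r0.xz, -r1.xwzy. the Reg type and the helpers swizzle, negate and writeMasked are hypothetical and only mimic what the hardware does with the modifiers in the table:

```cpp
#include <array>

using Reg = std::array<float, 4>;  // index 0..3 = x, y, z, w

// Source swizzle: each destination slot picks any source component.
Reg swizzle(const Reg& r, int ix, int iy, int iz, int iw) {
    Reg out = {{r[ix], r[iy], r[iz], r[iw]}};
    return out;
}

// Source negation (-R).
Reg negate(const Reg& r) {
    Reg out = {{-r[0], -r[1], -r[2], -r[3]}};
    return out;
}

// Destination write mask: only the enabled components are overwritten.
Reg writeMasked(Reg dest, const Reg& value, bool x, bool y, bool z, bool w) {
    if (x) dest[0] = value[0];
    if (y) dest[1] = value[1];
    if (z) dest[2] = value[2];
    if (w) dest[3] = value[3];
    return dest;
}

// Emulates:  mov r0.xz, -r1.xwzy
Reg movExample(const Reg& r0, const Reg& r1) {
    Reg src = negate(swizzle(r1, 0, 3, 2, 1));  // .xwzy, then negate
    return writeMasked(r0, src, true, false, true, false);
}
```

the modifiers are applied while the operands are read or written, so they do not cost extra instructions.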
Guidelines for writing the vertex shaders
the most important restrictions you should remember when writing vertex shaders are the following:
they must write to at least one component of the output register oPos.
there is a 128 instruction limit
every instruction may source no more than one constant register, e.g. add r0, c4, c3 will fail.
every instruction may source no more than one input register, e.g. add r0, v1, v2 will fail.
there are no c-like conditional statements, but you can mimic an instruction of the form r0 = (r1 >= r2) ? r3 : r4 with the sge instruction.
all iterated values transferred out of the vertex shader are clamped to [0..1]
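the sge-based selection mentioned in the restrictions can be sketched as follows. this is one common formulation (sge, then add with a negated source, then mad), emulated on the CPU with hypothetical helpers; it is an assumption that the notes had this exact sequence in mind:

```cpp
struct Vec4 { float x, y, z, w; };

// sge: 1.0 where a >= b, else 0.0, per component.
Vec4 sge(const Vec4& a, const Vec4& b) {
    return {a.x >= b.x ? 1.0f : 0.0f, a.y >= b.y ? 1.0f : 0.0f,
            a.z >= b.z ? 1.0f : 0.0f, a.w >= b.w ? 1.0f : 0.0f};
}
Vec4 add(const Vec4& a, const Vec4& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}
Vec4 neg(const Vec4& a) { return {-a.x, -a.y, -a.z, -a.w}; }
Vec4 mad(const Vec4& a, const Vec4& b, const Vec4& c) {
    return {a.x * b.x + c.x, a.y * b.y + c.y,
            a.z * b.z + c.z, a.w * b.w + c.w};
}

// r0 = (r1 >= r2) ? r3 : r4, per component:
//   sge r0, r1, r2        ; r0 = 1 or 0
//   add r5, r3, -r4       ; r5 = r3 - r4
//   mad r0, r0, r5, r4    ; r0 = r0 * (r3 - r4) + r4
Vec4 select(const Vec4& r1, const Vec4& r2, const Vec4& r3, const Vec4& r4) {
    Vec4 r0 = sge(r1, r2);
    Vec4 r5 = add(r3, neg(r4));
    return mad(r0, r5, r4);
}
```

where the comparison yields 1 the mad returns r3, and where it yields 0 it returns r4, so the choice is made per component without any branch.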
several ways to optimize vertex shaders:
when setting vertex shader constant data, try to set all data in one SetVertexShaderConstant call.
pause and think about using a mov instruction, you may be able to avoid it.
choose instructions that perform multiple operations over instructions that perform single operations.
collapse vertex shaders (remove complex instructions like m4x4 or m3x3) before thinking about optimizations.
a rule of thumb for load-balancing between the cpu/gpu: many calculations in shaders can be pulled outside and reformulated per-object instead of per-vertex and put into constant registers. if you are doing some calculation which is per object rather than per vertex, then do it on the cpu and upload it to the vertex shader as a constant, rather than doing it on the GPU.
one of the most interesting methods to optimize your application's bandwidth usage is the use of compressed vertex data.
Compiling a Vertex Shader
Direct3D uses byte-codes, whereas OpenGL implementations parse a string. therefore the Direct3D developer needs to assemble the vertex shader source with an assembler. this might help you find bugs earlier in your development cycle and it also reduces load-time.
three different ways to compile a vertex shader:
write the vertex shader source into a separate ASCII file, for example test.vsh, and compile it with the vertex shader assembler into a binary file, for example test.vso. this file will be opened and read at game start up. this way, not every person will be able to read and modify your vertex shader source.
write the vertex shader source into a separate ASCII file or as a char string into your *.cpp file and compile it "on the fly" while the application starts up with the D3DXAssembleShader*() functions.
write the vertex shader source in an effect file and open this effect file when the application starts up. the vertex shader can be compiled by reading the effect file with D3DXCreateEffectFromFile. it is also possible to pre-compile an effect file. this way, most of the handling of vertex shaders is simplified and handled by the effect file functions.
 
5) Creating a vertex shader handle with CreateVertexShader.
the CreateVertexShader function is used to create and validate a vertex shader.
6) setting a vertex shader with SetVertexShader for a specific object.
you set a vertex shader for a specific object by using SetVertexShader before the DrawPrimitive() call of this object.
once a vertex shader has been set with SetVertexShader, it is executed as many times as there are vertices.
7) delete a vertex shader with DeleteVertexShader().
when the game shuts down or when the device is changed, the resources taken by the vertex shader must be released. this must be done by calling DeleteVertexShader with the vertex shader handle.

Point light source.
a point light source has color and position within a scene, but no single direction. all light rays originate from one point and illuminate equally in all directions. the intensity of the rays will remain constant regardless of their distance from the point source unless a falloff value is explicitly stated. a point light is useful to simulate a light bulb.

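to get a wider range of effects, a decent attenuation equation is used:
funcAttenuation = 1 / (A0 + A1 * dL + A2 * dL * dL)
where dL is the distance from the light to the vertex, and A0, A1 and A2 are the constant, linear and quadratic attenuation factors.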
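a CPU-side sketch of this attenuation, reading the formula as 1 / (A0 + A1 * dL + A2 * dL * dL). it also shows how the dst instruction described earlier can build the (1, d, d*d, 1/d) vector that feeds the computation. Vec4, dst, attenuation and attenuationViaDst are hypothetical helpers, not D3D API calls:

```cpp
#include <cmath>

struct Vec4 { float x, y, z, w; };

// dst as listed above: dest = (1, src1.y * src2.y, src1.z, src2.w).
Vec4 dst(const Vec4& s1, const Vec4& s2) {
    return {1.0f, s1.y * s2.y, s1.z, s2.w};
}

// Direct form of the attenuation equation.
float attenuation(float a0, float a1, float a2, float d) {
    return 1.0f / (a0 + a1 * d + a2 * d * d);
}

// Shader-style form: build (1, d, d*d, 1/d) with dst, take the
// three-component dot product with (A0, A1, A2), then the reciprocal.
float attenuationViaDst(float a0, float a1, float a2, float d) {
    Vec4 s1 = {0.0f, d * d, d * d, 0.0f};        // (*, d*d, d*d, *)
    Vec4 s2 = {0.0f, 1.0f / d, 0.0f, 1.0f / d};  // (*, 1/d, *, 1/d)
    Vec4 r = dst(s1, s2);                        // (1, d, d*d, 1/d)
    float denom = a0 * r.x + a1 * r.y + a2 * r.z; // dp3 with the constants
    return 1.0f / denom;                          // rcp
}
```

in a shader, d*d usually comes from a dp3 of the light vector with itself and 1/d from rsq, so the distance itself never has to be computed with a square root.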

posted on 2008-12-09 11:18 by jolley
