
No... In truth, PVWp is wrong because P, V and W (as Direct3D defines them) were built for the [row vector]*[matrix] multiplication order. In other words, the contents of a transformation matrix differ depending on which multiplication convention it was built for.

For example, consider a translation matrix:

For a [row vector]*[matrix] multiplying order, it is described as:
1 0 0 0
0 1 0 0
0 0 1 0
x y z 1

For a [matrix]*[column vector] multiplying order, it is described as:
1 0 0 x
0 1 0 y
0 0 1 z
0 0 0 1
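
A quick way to check that the two layouts describe the same translation is to run both conventions on the same point. This is a minimal Python sketch, not D3D code; the helper names and the sample offsets (5, 6, 7) are made up for illustration.

```python
def row_vec_times_matrix(v, m):
    """[row vector] * [matrix]: each output component is v dot a column of m."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def matrix_times_col_vec(m, v):
    """[matrix] * [column vector]: each output component is a row of m dot v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

x, y, z = 5.0, 6.0, 7.0  # hypothetical translation offsets

# Row-vector convention (Direct3D): translation sits in the bottom row.
t_row = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [x, y, z, 1]]

# Column-vector convention: translation sits in the right-hand column.
t_col = [[1, 0, 0, x],
         [0, 1, 0, y],
         [0, 0, 1, z],
         [0, 0, 0, 1]]

p = [1.0, 2.0, 3.0, 1.0]  # a point, w = 1

moved_row = row_vec_times_matrix(p, t_row)  # matrix matches the convention
moved_col = matrix_times_col_vec(t_col, p)  # matrix matches the convention
mixed = matrix_times_col_vec(t_row, p)      # row-style matrix, column-style product
```

Both matched cases give (6, 8, 10, 1). The mixed case leaves x, y, z unmoved and puts the translation into w instead, which is the kind of mismatch you get when row-vector matrices are used in a column-vector product such as PVWp.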


I don't know the math details you're attempting to work out... I'm really bad at formal math theory. I do however know the D3D details of what's going on. Perhaps if I explain what D3D is doing, it'll help you.

The matrix as it is normally laid out in memory:
11 12 13 14
21 22 23 24
31 32 33 34
41 42 43 44

Normally a vector * matrix transform such as D3DXVec4Transform will do:
outx = vec dot (11,21,31,41)
outy = vec dot (12,22,32,42)
outz = vec dot (13,23,33,43)
outw = vec dot (14,24,34,44)
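
The four dot products above can be sketched in Python roughly as follows. This only illustrates the arithmetic and is not the D3DX implementation; the function names are made up, and the matrix is filled with its own element labels (11, 12, ...) so it is easy to see which elements end up where.

```python
def dot4(a, b):
    """Four-component dot product."""
    return sum(a[i] * b[i] for i in range(4))

def transform_row_vector(vec, m):
    """vec * m: output component j is vec dotted with column j of m."""
    return [dot4(vec, [m[0][j], m[1][j], m[2][j], m[3][j]]) for j in range(4)]

# Each element holds its own label, e.g. m[1][2] is element "23".
m = [[11, 12, 13, 14],
     [21, 22, 23, 24],
     [31, 32, 33, 34],
     [41, 42, 43, 44]]

picks_row_1 = transform_row_vector([1, 0, 0, 0], m)
picks_row_4 = transform_row_vector([0, 0, 0, 1], m)
```

A vector of (1,0,0,0) picks out the first row of the matrix, and (0,0,0,1) picks out the fourth row (41, 42, 43, 44), which is why the translation lives in the bottom row under this convention.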

When you give a matrix to a shader, it is transposed, which offers a small optimization for most matrices, which I'll explain in a bit. After it's transposed, it's stored in 4 constant registers (or 3... I'll get to that).

c0 = 11,21,31,41
c1 = 12,22,32,42
c2 = 13,23,33,43
c3 = 14,24,34,44

Next, performing a "mul(vec,mat)" in the shader will do this:
v0 = input register containing position
r0 = temp register
dp4 r0.x, v0, c0 // (r0.x = v0 dot c0)
dp4 r0.y, v0, c1
dp4 r0.z, v0, c2
dp4 r0.w, v0, c3
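
To convince yourself that the transposed registers plus four dp4 instructions give the same answer as the CPU-side transform, here is a small Python emulation. It is a sketch under the assumptions described above (the matrix is transposed on upload, one register per column); the function names are hypothetical and nothing here touches the real D3D API.

```python
def dot4(a, b):
    """Four-component dot product, standing in for the dp4 instruction."""
    return sum(a[i] * b[i] for i in range(4))

def upload_transposed(m):
    """Emulate the hidden transpose: register c[j] holds column j of m."""
    return [[m[r][j] for r in range(4)] for j in range(4)]

def shader_mul(v0, regs):
    """Emulate 'dp4 r0.x, v0, c0' through 'dp4 r0.w, v0, c3'."""
    return [dot4(v0, c) for c in regs]

def cpu_transform(vec, m):
    """Plain vec * m on the CPU, written out element by element."""
    return [sum(vec[k] * m[k][j] for k in range(4)) for j in range(4)]

m = [[11, 12, 13, 14],
     [21, 22, 23, 24],
     [31, 32, 33, 34],
     [41, 42, 43, 44]]

regs = upload_transposed(m)
v0 = [1, 2, 3, 1]
gpu_result = shader_mul(v0, regs)
cpu_result = cpu_transform(v0, m)
```

Here regs[0] comes out as (11, 21, 31, 41), matching c0 above, and the two results are identical.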

As you can see, this is the same computation as the D3DX transform above. Why does D3D perform a hidden transpose? To save precious constant space. You can declare your matrix as float4x3 and the transformation becomes:
dp4 r0.x, v0, c0
dp4 r0.y, v0, c1
dp4 r0.z, v0, c2
mov r0.w, (some constant holding 1)

Any time the matrix isn't a projection, i.e. for world, worldview, view, and especially bone matrices, you can drop a constant without affecting the results, because the dropped register is always the vector (0,0,0,1). Back in shader 1.1, with only 96 constants, this was a big deal. If you had 20 bone matrices, that would be either 80 or 60 constants. Personally, I'd take the 60, leaving more room for lights, fog, texture transforms, etc. It also takes time to upload all those useless (0,0,0,1) vectors to the video card, so skipping them is another small saving.
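
The saving can be checked with the same kind of Python emulation. This sketch assumes an affine matrix (last column (0,0,0,1)) and an input position with w = 1; the names and the sample world matrix are made up for illustration.

```python
def dot4(a, b):
    """Four-component dot product, standing in for the dp4 instruction."""
    return sum(a[i] * b[i] for i in range(4))

def columns(m):
    """One 'constant register' per column of m (the transposed upload)."""
    return [[m[r][j] for r in range(4)] for j in range(4)]

def mul_4x4(v0, regs):
    """Four dp4 instructions using four registers."""
    return [dot4(v0, c) for c in regs]

def mul_4x3(v0, regs):
    """Three dp4 instructions, then w is simply set to 1."""
    return [dot4(v0, c) for c in regs[:3]] + [1]

# Hypothetical world matrix: a translation by (5, 6, 7), row-vector layout.
world = [[1, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 1, 0],
         [5, 6, 7, 1]]

regs = columns(world)
dropped_register = regs[3]       # the register float4x3 never uploads
position = [1, 2, 3, 1]          # w = 1, as for a vertex position

full = mul_4x4(position, regs)
short = mul_4x3(position, regs)

# Constant budget for 20 bones in a 96-constant shader model.
bones = 20
constants_4x4 = bones * 4
constants_4x3 = bones * 3
```

The dropped register is (0,0,0,1), both paths produce the same position, and the bone palette shrinks from 80 constants to 60.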

posted on 2010-07-20 11:25 MDnullWHO
