http://devgurus.amd.com/thread/158655
What does ALUBusy in APP profiler really mean?
If there is branching in a kernel, the SIMD unit will run multiple passes to cover all cases. The ALUs are actually used less efficiently. Does ALUBusy reflect this?
__kernel void MyKernel()
{
    float x = 0, y = 0;
    // Even and odd work-items take different paths, so every wavefront diverges.
    if (get_global_id(0) % 2 == 0)
        x++;
    else
        y++;
}
Should APP Profiler show ALUBusy as 100% or 50%?
Thank you!
ALUBusy measures the percentage of GPU time during which ALU instructions are being processed. There are many possible reasons for a low ALUBusy number, for example, not enough active wavefronts to hide instruction latency, or heavy memory access.
Code divergence can be measured with VALUUtilization counter if you have SI hardware.
Hi,
ALUBusy is a very important factor for program optimization.
When a kernel branches, some ALUs are "busy" doing useless work. I wonder whether that useless work is counted in ALUBusy.
Some branching is inevitable. If a low ALUBusy is caused by such branching, I will not try to optimize the program any further.
Any suggestions will be appreciated.
Vis Cocoa
Vis,
Did you find the answer? I am wondering about the same thing.
NURBS.
Hi NURBS,
I think Lihan Bin gave us the correct answer.
BTW: I once worked on NURBS for a while :-) Defining a beautiful surface with just a few control points. That's really cool!
Vis Cocoa
Hi Lihan Bin,
Thank you very much for your answer.
I need a clear definition of ALUBusy. According to your comment, inefficiency caused by branching is not counted. Even if only one thread in a group is doing useful work, the ALU is 100% busy (at that moment).
I think divergence is a very important factor for kernel tune-up.
I did not find VALUUtilization in App Profiler. What is SI hardware?
Thank you again and have a great weekend!
Vis Cocoa
SI is short for Southern Islands, AMD's newest GPU architecture. On the desktop, SI covers the HD79x0, HD78x0 and HD77x0 series. For the mobile chips you will have to look it up.